Journal of Modern Computer Networks
Cloud Based Datacenter Network Acceleration
Using FPGA for Data-Offloading
Kennedy Chinedu Okafor (a,1,2), V. C. Chijindu (b,1), G. C. Ononiwu (a,1), and O. C. Nosiri (a,1)
(a) Dept. of Electrical and Electronics Engineering, Federal University of Technology Owerri, Nigeria; (b) Dept. of Electronic Engineering, University of Nigeria, Nsukka, Nigeria
Currently, the high-performance processors in the Spine-Leaf, Mesh, and Router layer-3 (SLMR-3) backend server domain have multiple cores, but data offloading from the processor to the peripheral is not keeping pace with the Quality of Service (QoS) needed to balance the workload on a Warehouse Scaled Computer (WSC) running a developed Enterprise Energy Tracking Analytic Cloud Portal (EETACP) data center network. A high-speed, low-latency interconnect between the processors and a Field Programmable Gate Array (FPGA) is critical for achieving performance benefits in EETACP deployment. Most servers in WSC architectures run at modest average utilization rates, well below their peak processing capacity. These servers are good candidates for FPGA co-processors in cloud-based data centers owing to their acceleration coherency. This paper makes a case for cloud-based FPGA support for EETACP. An FPGA-based Spine-Leaf model is proposed as an alternative to traditional network models for EETACP provisioning. The paper analyzes reconfigurable FPGAs and characterizes a simplified process model for a hyperscale FPGA cloud design description. To validate performance, comparisons were made with two similar networks, DCell and BCube, for enterprise application deployments. It was concluded that FPGA-based DCN acceleration for EETACP offers acceptable QoS.
FPGA System on Chip, Cloud Computing, Virtualization, VHDL, Net-
work Optimization, Quality of Service
1. Introduction
Cloud datacenters are designed and built for various high-
performance computing services such as office collaborative
tools (e.g., Microsoft Office 365, Google Drive), search engines (e.g., Bing, Google), global stock market analysis, entertainment (sports broadcasting, news mining, games, etc.), mechatronics integrations, and other scientific workloads [1, 2]. Today, the servers in these datacenters are interconnected using either the Spine-Leaf, Mesh, or Routed Layer-3 model (SLMR-3) [3]. Cloud application datacenter networks are large and usually connect hundreds of thousands of servers via their layer-3 switch fabrics. A good data offloading strategy in the cloud datacenter network is critical to ensure that servers, switches, routers, load-balancers, and their applications do not encounter crippling bandwidth bottlenecks due to utilization and over-subscription. This helps to isolate services from each other and allows more flexibility in workload placement, rather than having to restrict workloads to where bandwidth is available.
Besides, due to the rise of cloud computing integrations, low
latency and high throughput datacenter networking (DCN)
is now an important area of research. The use of the current SLMR-3 in a typical EETACP deployment context is novel. With cloud provisioning, the beneficial role of FPGA components in datacenter acceleration vis-à-vis SLMR-3 becomes an interesting and timely subject. Several contributions and studies on DCN have not considered datacenter acceleration for Quality of Service (QoS) improvement. For instance, topology design and routing are the focus in [4–8]. Architectural tiers are the emphasis in [3], [9]. Flow scheduling and congestion control are the consideration in [10], [11], [12]. Virtualization is the focus in [13], [14], while application support is the focus in [15], [16]. In all these studies, little attention has been given to QoS performance using FPGA service processing cores. Since cloud-based DCN is a relatively new exploration area in high-performance networks, many of the designs discussed in [5], [6], [8], [11], [14], and [16] do not investigate DCN acceleration. Using a Spine-Leaf FPGA network model has many benefits for high-performance market segments. According to [17] and [18], a 4-way Layer-3 Leaf/Spine architecture with Equal Cost Multi-Path (ECMP) processing for routing and other computing services is the new template for high-performance datacenter network designs. In a cloud-based scenario, the type of switch or even the server processor cores can contribute to congestion delays regardless of the data offloading strategy. For example, Portland [8], BCube, and Quantized Congestion Notification (QCN) [11] use rate-based congestion control, which is not efficient. Hence, the current Ethernet switches, IP routers, servers, etc., found in existing datacenter architectures cannot be used to implement high-performance datacenter designs.
A high-end FPGA System on Chip (FSoC) could be employed for data offloading, leading to improved QoS for enterprise applications. There are two types, namely the Static Random Access Memory (SRAM) and the Antifuse versions. These are semiconductor devices built on a matrix of Configurable Logic Blocks (CLBs) connected via programmable interconnects [19].
By construction, FPGAs are efficient at executing a predictable workload. Given that datacenter workloads require high computational capability, energy efficiency, and low cost, a legacy commodity server cannot satisfy these demands. As such, an FSoC can be reprogrammed to offer flexible acceleration of workloads.
The authors declare no conflict of interest.
Academic Editor: Dr Mohamad Yusof Darus, Senior Lecturer, Faculty of Computer and Mathematical Sciences, Universiti Teknologi MARA, Malaysia
1 All authors contributed equally to this work.
2 Corresponding author email: kennedy.okafor@futo.edu.ng
To date, many cloud datacenters have not deployed FSoCs as compute accelerators. Hence, to implement efficient cloud DCN designs, rich programmability is required in the cloud DCN service processors, besides the role of Type-1 bare-metal virtualization [20].
There are two approaches in this regard, namely pure software-based programmability [21], [22] and FPGA-based programmability such as NetFPGA [23]. Software-based systems can provide full programmability while delivering a reasonable packet forwarding rate, but their performance is still not comparable to commodity switch and server Application Specific Integrated Circuits (ASICs). The batch processing used in existing server switches and software-based switches yields optimizations that introduce high latency. This is critical for various control plane functions such as signaling and congestion control [6], [8], [11] in high-performance networks.
For bandwidth-intensive applications, FPGAs can be designed for low latency, which offers higher value for cloud computing processes. Since FPGA-based systems are fully programmable [24], a datacenter backend can be optimized through in-circuit reconfiguration at power-up to support more functions and achieve seamless data offloading. Hence, the latest trend in server performance is the data-offloading paradigm. It involves pairing an x86 processor with a highly customizable FPGA device architecture. With this method, workload performance can be enhanced while accommodating changing needs in the future. Clearly, a data-offloading FSoC will improve the throughput of cloud-based Software as a Service (SaaS) by co-processing with a commodity CPU. This same concept can accelerate cloud database searches for improved performance. The major trade-off for acceleration (cloud workload offloading in this case) is that frequent or repetitive tasks or task sequences will affect power demand.
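To make the pairing concrete, the following is a minimal C++ sketch of how an application-level dispatcher might decide between the FPGA co-processor and the commodity CPU. The class name, the queue-depth parameter, and the fallback policy are illustrative assumptions rather than any vendor interface; a production system would drive the card through its PCIe/DMA driver, which the two std::function kernels stand in for here.

// Hypothetical offload dispatcher: not the paper's implementation.
#include <cstdint>
#include <functional>
#include <vector>

using Block  = std::vector<uint8_t>;
using Kernel = std::function<Block(const Block&)>;

class FpgaOffloader {
public:
    FpgaOffloader(Kernel fpga, Kernel cpu, std::size_t depth)
        : fpga_(std::move(fpga)), cpu_(std::move(cpu)), depth_(depth) {}

    // Offload when the accelerator has a free slot; otherwise stay on the
    // commodity CPU so the workload never blocks behind the co-processor.
    Block process(const Block& in) {
        if (in_flight_ < depth_) {
            ++in_flight_;            // stands in for queuing a DMA descriptor
            Block out = fpga_(in);   // stands in for the hardware pipeline
            --in_flight_;            // in a real system a completion interrupt
                                     // would decrement this asynchronously
            return out;
        }
        return cpu_(in);             // accelerator saturated: x86 fallback
    }

private:
    Kernel fpga_, cpu_;
    std::size_t depth_;
    std::size_t in_flight_ = 0;
};

The fallback branch captures the power/throughput trade-off noted above: repetitive tasks stay on the accelerator only while capacity (and the power budget) permits.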
As far as this work is concerned, little research has been carried out in the literature investigating the QoS effects of cloud network servers, routers, etc., driven by FPGA cores. Hence, there is a need to explore FPGA target device architectures in developing DCCNs for cloud-based services such as the Enterprise Energy Tracking Analytic Cloud Portal (EETACP), e.g., databases, big data analytics, and high-performance computing.
2. Related Works
2.1 Cloud Datacenter Networking
Traditional datacenter network architectures such as DCCN [3], Portland [8], DCell [25], BCube [26], R-DCN [27], Helios [28], c-Through [29], etc., have been extensively studied. Most of them use a recursive scheme for scalability and performance, while others construct a separate optical network with an expensive high port-count 3D MEMS switch side by side with the existing datacenter to add core bandwidth on the fly. Some DCNs, like OmniSwitch, which is a modular datacenter network architecture, integrate small optical circuit switches with Ethernet switches to provide both topological flexibility and large-scale connectivity. These architectures can be re-modified using the enhanced Spine-Leaf, mesh, and router layer-3 (tier-2) models running on a low-latency FPGA core. This has not been used in server-centric application deployment strategies. The author in [30] highlighted issues affecting existing commercial off-the-shelf Ethernet switches for these architectures at high link speeds, such as 10 gigabits per second (Gbps). The challenges include:
(a) Extreme complexities, particularly the switch software,
wiring and scaled troubleshooting.
(b) Availability of various failure modes in the absence of
fail-over schemes.
(c) Existing large commercial switches and routers are expen-
sive.
(d) Some datacenters require high port density at the aggre-
gate or datacenter level switches at extremely high link
bandwidth.
(e) Other issues are over-subscription, microburst detection problems when using SNMP polling for TCP incast (i.e., many-to-one traffic patterns), high queuing latency, an absence of mobility support for virtual server infrastructure, poor scalability, and inflexibility resulting from legacy designs that have compatibility issues with automated virtualized datacenters.
Therefore, many researchers have continued to evolve datacenter network architectures, with most of them focusing on the novel design philosophy of Spine-Leaf, mesh, and router layer-3 models [31], [32].
The new trend in datacenter network models is to address the issues of optimal performance, such as low latency, availability/fault tolerance, utilization, energy efficiency, and scheduling of resources, regardless of the network device.
Regarding the architectural design framework, the most closely related work to this research is the Datacenter-in-a-Box at Low cost (DIABLO) FPGA cluster prototype in [30]. The authors discussed a novel, cost-efficient evaluation methodology in which FPGAs were used while treating datacenters as whole computers with tightly integrated hardware and software. The work enumerated three model types, viz.: (i) Server models: built on top of RAMP Gold (SPARC V8 ISA), running full Linux 3.5 with a fixed-CPI timing model; (ii) Switch models: based on circuit and packet switching with abstracted models focusing on switch buffer configurations; (iii) NIC models: having a scatter/gather DMA with zero-copy drivers as well as NAPI polling support.
In integrating the cloud DCN nodes to FPGA cores, Figure 1 illustrates a high-level structure. The system used six BEE3 boards carrying 24 Xilinx Virtex-5 FPGAs [30]. The simulation was realized with 3072 servers in ninety-six racks. The network switches were used at 8.4 billion instructions/second. The validation was done on a single-rack physical system with a sixteen-node cluster of 3 GHz Xeon servers and a 16-port Asante IntraCore 35516-T switch. The physical hardware setup had two servers plus 1 to 14 clients. The software configuration included server protocols (TCP/UDP), four server worker threads (the default), and eight simulated single-core servers with a 4 GHz fixed-CPI model.
Figure 2 shows a type 1 DIABLO without inter-board connections and a type 2 DIABLO fully connected with high-speed cables. Type 2 shares similar features with this work. With
Fig. 1. DIABLO cluster physical mapping [33]
FPGAs and the use of programmable hardware platforms, simplifying the load on cloud nodes and network devices will enhance performance. As such, a cloud of general-purpose resources (FPGAs) was used to offload the processed tasks.
Putnam et al. [34] described a reconfigurable fabric (FPGA Catapult) designed to balance several performance concerns. The system was embedded into each half-rack of 48 servers in the form of a small board with a medium-sized FPGA and local DRAM attached to each server. As depicted
in Figure 2, FPGAs are directly wired to each other in a 6x8
two-dimensional torus, allowing services to allocate groups
of FPGAs to provide the necessary area to implement the
desired functionality. The work was evaluated by offloading
a significant fraction of Microsoft Bing’s ranking stack onto
groups of eight FPGAs to support each instance of this service
[34]. Based on the performance expectations of the earlier proposed EETACP (a cloud application deployed on DCCN), the key goals for any datacenter architecture include [9]:
(a) Deterministic latency
(b) Redundancy/high availability
(c) Manageability/flexibility
(d) Excellent resource allocation and scheduling
(e) Scalability and fault tolerance
An improved network architecture based on an FPGA fabric is proposed to achieve the goals above. This model has been shown to outperform the Spine-Leaf, mesh, and layer 3-routed models owing to the performance characteristics of the device: it supports lower latency, offloading, seamless integration, and computing scalability. It is therefore worth outlining the advantages and disadvantages of the current Spine-Leaf, mesh, and Layer 3-routed network designs, as shown in Table 1.
In the EETACP DCCN [34], a low-latency, fault-tolerant network was achieved. In this case, the number of network tiers was reduced to minimize system latency. An FPGA-based fabric structure likewise simplifies management, reduces cost, and allows resilient, low-latency networks to be designed, just like the Spine-Leaf model. The robust architectural concepts supported in the DCCN architectures provide high availability and deterministic low latency, and can scale up or down with demand. EETACP was tightly integrated with the OmniVista™ 2500 Virtual Machine Manager (VMM), providing a unified platform for virtual machine visibility and provisioning with virtual network profiles across the network. These allow seamless server interfacing.
By introducing an FPGA cluster into the above architectures, the advantages in cloud datacenter networks (e.g., DCCN) include:
(a) Allows multi-chassis terminated link aggregation groups to be created.
(b) Creates a loop-free edge without Spanning Tree Protocol (STP).
(c) Provides node- and link-level redundancy, particularly with the Integrated Service OpenFlow load balancer.
(d) Enables the overall architecture to be geo-independent, i.e., no co-location is required.
(e) Active support for interconnect switches using standard 10G and 40G Ethernet optics.
(f) Supports redundancy and resiliency across the switches
connecting EETACP servers.
In Web-scale data centers, boosting performance with a common FPGA device architecture across thousands of servers will save cost. Besides, leveraging FPGAs for acceleration in Spine-Leaf models will improve dynamic over-allocation and change management for large-scale data centers, because enterprise tools must track the FPGA algorithm as it is updated. This is needed for enterprise adoption. With the availability of server virtualization, a hyperscale datacenter could use FPGA capabilities. This paper argues that new processor architectures based on a programmable FPGA device have several advantages for cloud service provisioning: they allow for scalability on demand and loosely coupled system designs. Joost and Salomon [35] showed that FPGAs are best suited for
Fig. 2. DIABLO cluster prototype with 6 BEE3 boards [30]
Table 1. Advantages and Disadvantages of the Current Spine-Leaf, Mesh, and Layer 3-Routed Network Designs

Spine-Leaf Model
Advantages: offers a layer 2/3 common fabric implementation; facilitates a simpler design; fewer interconnects; easy to scale within a boundary, with better latency transition.
Disadvantages: the additional layer of transit hops may impact latency and over-subscription; scalability is limited to the number of ports in the spine layer.

Mesh Model
Advantages: offers a layer 2/3 differentiated fabric implementation; the implementation is highly scalable; no transit hop; lower latency and lower over-subscription ratios.
Disadvantages: more links are used for interconnects.

Layer 3-Routed Model
Advantages: offers an end-to-end routed fabric implementation; easy to secure at the IP layer; fewer interconnects; easy to scale.
Disadvantages: highly oversubscribed architecture; the number of transit hops is not deterministic, impacting latency; complex design and maintenance.
tackling most industrial and network-based applications, such as supervisory control systems, cloud computing, the Internet of Things, and other grid computing services. FPGAs are shown to be very powerful, relatively inexpensive, and adaptable, because their configuration is specified in an abstract hardware description language. FPGA-based implementations combine many advantages, such as rapid development cycles, high flexibility, re-usability, moderate cost, easy upgrading (owing to the use of abstract Hardware Description Languages (HDLs)), and feature extension (as long as the FPGA is not exhausted). For the network elements in the cloud DCN, the FPGA cores on the servers, switches, and load balancers are managed by a management console in the form of a Software Defined Network (SDN) that separates the data, control, and application planes. In this context, when a switching policy is updated, the network is initially mapped in the design, thereby maintaining a default state and eliminating routine reprogramming of the FPGA logic cells. The use of FPGAs can also complement other chipset accelerators (e.g., GPUs), but at the expense of writing new procedures in VHDL. The issues of power consumption and area-on-chip are vital for performance, considering the number of FPGA cores needed in the network; this trade-off is left for future research.
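As an illustration only (not the implementation used in this work), the sketch below shows how a controller could update a switching policy by rewriting a match-action table exposed by the FPGA data plane, so that the default mapping is preserved and the logic cells are never reprogrammed. The table layout, field names, and C++ types are assumptions made for the example.

// Hypothetical match-action table; a real SDN data plane would expose this
// through the controller's southbound interface rather than a C++ class.
#include <cstdint>
#include <functional>
#include <unordered_map>

struct FlowKey {
    uint32_t dst_ip;
    uint16_t dst_port;
    bool operator==(const FlowKey& o) const {
        return dst_ip == o.dst_ip && dst_port == o.dst_port;
    }
};

struct FlowKeyHash {
    std::size_t operator()(const FlowKey& k) const {
        return std::hash<uint64_t>{}((uint64_t(k.dst_ip) << 16) | k.dst_port);
    }
};

class MatchActionTable {
public:
    // Control plane: install or replace a rule at runtime.
    void install(const FlowKey& key, uint16_t egress_port) {
        table_[key] = egress_port;
    }
    // Data plane: look up the egress port, falling back to the default state.
    uint16_t lookup(const FlowKey& key, uint16_t default_port) const {
        auto it = table_.find(key);
        return it != table_.end() ? it->second : default_port;
    }
private:
    std::unordered_map<FlowKey, uint16_t, FlowKeyHash> table_;
};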
3. Methodology
In this section, the FPGA modular description is presented. A characterization scenario was used as a basis for generalization. To achieve this, an Electronic Design Automation (EDA) simulation tool (Riverbed Modeler) with an extended C++ library was employed in this study. Due consideration was given to an FPGA Virtex UltraScale-driven server machine. This was used for the Spine-Leaf DCCN design as it offers efficient performance, good system integration, and bandwidth, with the added benefit of re-programmability. In the enterprise setup, the peripheral controllers include general-purpose I/O, UART, Timer, Debug, SPI, DMA controller, and Ethernet (interface to an external MAC/PHY chip). The memory controllers include SRAM, Flash, SDRAM, and DDR SDRAM.
In context, the scalability of the Virtex UltraScale VU440 device is made possible by its ASIC-class architecture, supporting up to 90% utilization and featuring next-generation routing, ASIC-like clocking, efficient resource utilization, power management, elimination of interconnect bottlenecks, and critical-path optimizations. Its key architectural blocks include wider multipliers, high-speed memory cascading, 33G-capable transceivers, and integrated 100 Gb/s Ethernet MAC and 150 Gb/s IP cores. These devices enable multi-hundred gigabit per second levels of system performance with smart processing at full line rates. Figure 3 shows a proof of concept demonstrating the initial testbed setup for EETACP deployment. The configuration facilitates the dual-homing of servers/storage and access devices with links distributed across the DCCN switches. There is no logical loop between the edge devices and multi-chassis peer switches, even though a physical loop exists. Single-interface servers, storage, and edge devices can be connected to any DCCN switch via a virtualization management console. The setup is based on a general-purpose processor. Using Type-1 bare-metal virtualization offers VM instances that support failover, replication, and redundancy for a production environment. The assumption in this research is that the FPGA concept as well as Type-1 bare-metal virtualization must be integrated in the case of a myriad of servers, i.e., a massively scaled datacenter, to derive the expected QoS.
An FPGA scalable architecture [36] offers a template for adoption in DCCN. Specifically, a Xilinx FPGA comparison showing an optimal configuration for the Virtex UltraScale device has been enumerated in [19]. In that work, the logic cells (K), UltraRAM (Mb), Block RAM (Mb), DSP slices, transceiver count, maximum transceiver speed (Gb/s), total transceiver bandwidth (full duplex, Gb/s), memory interfaces (DDR3, DDR4), PCI Express, AES configuration, I/O pins, and I/O voltages were all compared against other device architecture variants, showing the Virtex UltraScale device as the preferred choice. This further motivated its adoption in the proposed DCN design in this section. The FPGA-based system implementation has the following characteristics:
(a) It allows for the integration of soft-core processors.
(b) It has plenty of logic resources for routing.
(c) It has plenty of RAM support. This, combined with the lack of a bypass path, led to a multi-threaded design of large modules.
In the validation analysis, this work focuses on FPGA-based datacenters for performance benchmarking. It must be stated that congestion offloading is derived through the use of the over-allocation considered in Figure 3. In this paper, the prototype design of the cloud-based datacenter has only been tested with a very small testbed running realistic micro-benchmarks for cloud computing services. The emphasis is on the QoS comparison with related datacenter cores. The role of Type-1 virtualization as a DCN accelerator is presented in Section 3.1.
3.1 Architectural Model (Type-1 Virtualization)
The goal of the FPGA-based network server model (DCCN) is to have credible workload generation that is scalable and efficient with respect to QoS for a congested traffic pool. A highly accurate framework for the cloud computing workload was developed, as shown in Figure 4. At the core, the server clusters must be capable of running complex server application software with minimal modification. In this context, an FPGA service model is responsible for executing the target procedure (router, switch, or server CPU) correctly, as well as maintaining the device architectural state in congested networks. This is made feasible by using the Type-1 virtualization strategy. The benefits of this management scheme include:
(a) Simplified mapping of the functional FPGA model. The
separation allows complex operations to take multiple
host cycles. For example, a highly-ported register file can
be mapped to a block RAM and accessed in multiple host
cycles, avoiding a large, slow mapping to FPGA registers,
multiplexers, etc.
(b) Improved flexibility and reuse of resources, even in over-allocation mode. With it, the precise server timing model can be changed without modifying the overall network model, which improves efficiency. For instance, it is
Fig. 3. DCCN EETACP server testbed (Kswitche Labs, 2015)
possible to use the same VM switch model to simulate both 10 Gbps and 100 Gbps switches just by changing the timing model.
(c) It enables a highly configurable, abstracted timing model. In the virtualized datacenter, splitting out the timing function allows the timing model to provide abstraction at the cloud layer. Looking closely at the FPGA characteristics for network architectures, this work identified a wide variety of design choices such as switch architecture, network topology, protocols, and applications.
To support data-intensive processing in an FPGA-based domain, the traffic workloads must be optimized in the cloud environment. As such, optimization via data management must be satisfied for enhanced QoS.
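The functional/timing split described in benefits (b) and (c) can be sketched as follows; the class names and the link rates are illustrative assumptions, and the only point is that one functional switch model can be reused while the timing model is swapped between a 10 Gb/s and a 100 Gb/s device.

// Illustrative sketch of the functional/timing separation, not the paper's code.
#include <cstddef>
#include <memory>

struct TimingModel {
    virtual ~TimingModel() = default;
    virtual double service_time_us(std::size_t frame_bytes) const = 0;
};

struct LinkRateTiming : TimingModel {
    explicit LinkRateTiming(double gbps) : gbps_(gbps) {}
    double service_time_us(std::size_t frame_bytes) const override {
        return frame_bytes * 8.0 / (gbps_ * 1e3);   // bits / (Gb/s) -> microseconds
    }
    double gbps_;
};

class SwitchModel {                       // functional behaviour is unchanged
public:
    explicit SwitchModel(std::unique_ptr<TimingModel> t) : timing_(std::move(t)) {}
    double forward(std::size_t frame_bytes) const {
        return timing_->service_time_us(frame_bytes);
    }
private:
    std::unique_ptr<TimingModel> timing_;
};

// Usage: SwitchModel ten(std::make_unique<LinkRateTiming>(10.0));
//        SwitchModel hundred(std::make_unique<LinkRateTiming>(100.0));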
3.2 FPGA Cloud Datacenter Specifications
This paper used the specification of the cloud datacenter network described in [37]; however, Type-1 server virtualization is considered for resource management in an FPGA-driven DCCN. The network fabric has an OpenFlow load balancer, a virtual gateway, and server instances on the hypervisor. In the network, a three-stage Clos topology using Nexus 7000 (spine) and Nexus 3000 (leaf) platforms, with FPGA-based N-Servers connected to them, forms a warehouse-scale cloud datacenter running on 10-40 Gbps links. These specifications are encapsulated in Figure 4.
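For clarity, the wiring implied by this specification can be sketched as below, where every leaf switch uplinks to every spine switch and the FPGA-based servers attach only to leaves. The switch and server counts are illustrative assumptions, not the actual testbed sizing.

// Toy enumeration of a leaf-spine (three-stage Clos) wiring plan.
#include <cstdio>
#include <string>
#include <utility>
#include <vector>

int main() {
    const int spines = 4, leaves = 8, servers_per_leaf = 16;   // assumed counts
    std::vector<std::pair<std::string, std::string>> links;

    for (int s = 0; s < spines; ++s)
        for (int l = 0; l < leaves; ++l)            // full bipartite spine-leaf mesh
            links.emplace_back("spine" + std::to_string(s),
                               "leaf" + std::to_string(l));

    for (int l = 0; l < leaves; ++l)
        for (int h = 0; h < servers_per_leaf; ++h)  // FPGA-backed servers on leaves
            links.emplace_back("leaf" + std::to_string(l),
                               "server" + std::to_string(l * servers_per_leaf + h));

    std::printf("links: %zu\n", links.size());      // 4*8 + 8*16 = 160
    return 0;
}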
3.3 Hyper-Scale Cloud Cluster Server (HCCS)
The FPGA card used in the Spine-Leaf cloud network server (shown in Figure 4) is depicted in Figure 5. This is based on Xilinx Virtex UltraScale FPGA technology (i.e., the target device). The characterization in the HCCS is mainly for the Spine-Leaf DCCN.
For data offloading at the server core, this prototype FPGA accelerator card has six Virtex-6 FPGAs linked together by a PCI-Express switch from PLX Technology. Three of them are fitted into a Supermicro SuperServer designed to accommodate three Tesla GPU coprocessors in the DCCN. This has a pair of six-core Xeon 5600-class processors, as shown in Figure 5. The processor core is depicted in Figure 3, while Figure 6 shows the logical placement. In this case, the server machine has 24 half-wide grid sockets. This pattern allows the x86 server processors (grouped into two) to fit into the testbed rack enclosure. On the server, the FPGA co-processor in Figure 6 has eight lanes mapping to Mini-SAS xSFF-8088 connectors, with two ports on each FPGA card. This speeds up data cycling and improves the utilization cycles of the CPU.
The server has enough space for a PCI-Express 3.0 peripheral card situated at the back of the server sled. It has two eight-core Xeon CPUs running at 2.1 GHz with 64 GB of main memory (DRAM). For storage capacity, four 2 TB disk drives (4 HDDs) and two 512 GB solid-state disks (2 SSDs) were introduced. The server node has a 10 Gb/s Ethernet port with redundant power supplies. Wireless connectivity via the bay ports is enabled by default in the DCCN. The DCCN server FSoC accelerator card, configured in a production setup, is distributed across the server cluster infrastructure. In the deployment context for HCCS-DCCN, two sets of cables are used to implement a ring connection with six xSFF-8088 connectors; eight connectors are used on a ring for duplication/redundancy. With the six adapter cables, the six FPGA cards (in six adjacent server nodes in the server chassis) are mapped to each other with one set of Mini-SAS ports. This arrangement allows eight different groups of FSoC nodes in a 48-node pod to be self-linked using eight adapter cables. During operation, the FSoCs run at 10 Gb/s on all the Ethernet-connected interfaces. Figure 7 shows the Virtex UltraScale VU440 device used for the service processing cores. It provides the highest system performance and bandwidth for large-scale computing and is well suited to the typical server scenario in Figure 3.
3.4 HCCS FSoC Data-Offloading Algorithm
Algorithm I describes the server interconnection read and write operations with FPGA data offloading. First, after defining the server configuration with its virtualization mappings, a 10 Gbps link is used for the link interconnection in the cluster subnet. An array of user input jobs arriving through a load balancer Lm with nonzero terms is defined for the server. To facilitate read operations from the server, the control variables (a, N, i, j) are used to execute successive read operations in matrix form.
Fig. 4. Cloud Computing Spine-Leaf Cluster (DCCN)
Fig. 5. A Typical Cloud FPGA Accelerator Network Card [38]
Fig. 6. A modified logical interfacing in the DCCN subnet cluster [38]
Fig. 7. An FPGA based Virtex UltraScale VU440 device architecture for
cloud Server board
To complete successfully, the j control checks for equal availability of server CPUs and their VMs. The first step in job processing is to select the shortest path (i.e., the one with the highest throughput) between the user job request and the server VM in the HCCS. As the workload increases, more bandwidth is over-allocated by the hypervisor virtual machine monitor (VMM), which translates into increased throughput along the path. All processed workloads are returned through the shortest path to end-users, and the cycle re-initializes and repeats the read and write operations.
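A hedged sketch of this path-selection step is given below: among the candidate leaf-spine paths to a server VM, the one with the highest available throughput is chosen, and the hypervisor may over-allocate bandwidth beyond the nominal link rate as the workload grows. The structure names and the 1.2x over-allocation factor are assumptions made for illustration, not values taken from Algorithm I.

// Illustrative path selection: pick the candidate with the most headroom.
#include <string>
#include <vector>

struct Path {
    std::string via_spine;
    double capacity_gbps;    // nominal link capacity (e.g. 10 or 40)
    double allocated_gbps;   // bandwidth already handed out by the VMM
};

const Path* select_path(const std::vector<Path>& candidates,
                        double over_allocation = 1.2) {
    const Path* best = nullptr;
    double best_avail = 0.0;
    for (const auto& p : candidates) {
        // Over-allocation lets the effective ceiling exceed the nominal rate.
        double avail = p.capacity_gbps * over_allocation - p.allocated_gbps;
        if (avail > best_avail) { best_avail = avail; best = &p; }
    }
    return best;   // null if every candidate is exhausted
}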
Using Algorithm I, the EDA study was used to explore the capabilities of the Virtex UltraScale FSoC for DCCN data offloading. In the study, the RAMs on the FSoC are used to store the simulation thread state, which is dynamically switched between threads in order to keep the data pipelines saturated. This memory strategy is called HCCS host-multithreading for low-latency data offloading. Its benefits are summarized below.
• Availability of hard-wired DSP blocks with execution units, especially Floating Point Units (FPUs), which otherwise dominate Look-Up Table (LUT) resource consumption. The implication is that by mapping functional units to DSP blocks rather than just LUTs, more resources are reserved for execution timing.
• DRAM accesses are relatively fast on the FSoC. The logic in the FSoC often runs slower than DRAM because of on-chip routing delays. This insight greatly simplifies the host memory system, as large associative caches are not needed for high performance. The trade-off between QoS performance and FPGA compute resources is the overall server cost budget parameter.
4. Simulation Validation
4.1 Experimental Design Description
First, an FPGA server process model was built for the DCCN VM clusters. This was realized using Riverbed Modeler Academic Edition 17.5 with its C++ libraries1 as the EDA tool. The implementation was on a heavily modified host-cache design. The server model supports a full 32-bit OS. At the core, the Virtex UltraScale was emulated into the service processors shown in Figure 4. In the real setup (depicted in Figure 4), the components introduced include the server farm virtual
1 https://splash.riverbed.com/community/product-lines/steelcentral/university-support-center/blog/2014/06/11/riverbed-modeler-academic-edition-release
firewall router (SFV), an emulated OpenFlow controller (OC), and application and profile configuration windows. This test-center configuration sets up the Web, Database, FTP, and Exchange servers, such as DCCN server 1, server 2, server 3, server 4, server 5, ..., N, and six locations with active users. The system servers run on the Virtex UltraScale FPGA target device. With Type-1 virtualization, servers are placed on the DCCN as VM clusters. The VMs connect user tasks to the HCCS, which processes services concurrently. The application (an HTTP service) runs on the OpenFlow controller, whose job is to dispatch the requests to the server clusters.
This facilitates resource allocation, scheduling, and load balancing in the DCCN. The simulation experiments were performed on an emulated cloud, at the IaaS level, using datacenter cardinality theory. For the DCCN VM clusters, two physical servers (2x 8-core Xeon 2.1 GHz CPUs, 64 GB DRAM, 4 HDDs of 2 TB, 2 SSDs of 512 GB, 10 Gb Ethernet, running Linux and Mac OS) were configured to run the CPU model. The VM instances were created according to the workloads per site. For acceleration, Type-1 full/active virtualization, failover, and over-allocation were simultaneously enabled to address the issues highlighted in Section 2.1. The process model experimental methodology considered four key metrics: service process latency, throughput, resource availability, and resource utilization. The execution time is measured using the timer functions provided by the C++ trace file diagnostic library. The throughput is determined at the destination as the ratio between the amount of data sent from users and the service processing time. Finally, each metric is computed based on the Riverbed framework/simulation for DCCN, DCell, and BCube. Each QoS metric is reported in the plots discussed below.
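For completeness, the following sketch shows how the reported throughput and utilization figures can be derived from trace records of this kind; the record fields are assumptions made for the example and do not reflect the actual Riverbed trace schema.

// Illustrative metric derivation from assumed trace-file records.
#include <vector>

struct TraceRecord {
    double bytes_sent;        // payload delivered for one request
    double service_time_s;    // time from dispatch to completion
    double server_busy_s;     // time the serving VM was occupied
};

double avg_throughput_bps(const std::vector<TraceRecord>& t) {
    double bytes = 0.0, secs = 0.0;
    for (const auto& r : t) { bytes += r.bytes_sent; secs += r.service_time_s; }
    return secs > 0.0 ? bytes * 8.0 / secs : 0.0;     // data sent / processing time
}

double utilization(const std::vector<TraceRecord>& t, double window_s) {
    double busy = 0.0;
    for (const auto& r : t) busy += r.server_busy_s;
    return window_s > 0.0 ? busy / window_s : 0.0;    // fraction of the window busy
}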
4.2 Performance Evaluation
After setting up three distinct network scenarios, DCell and BCube alongside the FPGA-based DCCN, a focused discussion of their services as well as their performance was presented in a previous study [38]. The first scenario measures the improvement brought to I/O-intensive FPGA applications from a service process throughput perspective. By adaptively switching end users from the leaf to the spine models, the servers read and process requests concurrently. This occurs within the data center management, which replicates processed multicast jobs and transfers them in a pipelined fashion within the deployment. This paper presents a set of results obtained from a QoS comparison among the three networks using the remote cloud storage, which hosts services such as FTP, database/storage, etc.
4.3 Analysis of Non-FPGA Cloud DCNs
The experiment in context focused on the comparative analysis of three distinct DCNs, viz. the Spine-Leaf DCCN (proposed), DCell, and BCube, for network throughput, resource availability, and resource utilization. These networks were configured using a scenario-based approach, in which the cloud computing application workload is homogeneous. A suitable framework used to evaluate the impact of FPGA acceleration on the cloud datacenter is MapReduce [39]. This makes cloud-based computation flexible, though with its performance trade-offs in the cloud. This work used an emulated cached MapReduce engine [40] and a general-purpose workflow engine
Algorithm I: DCCN Server Read/Write Operations
Procedure FPGA_DataOffloading_ReadWrite // use the FPGA to carry out data offload via read and write operations
Define DCCN-Server I/O // a distributed cloud computing server must have well-defined inputs and outputs
Program ServerMatrix(Input, Output)
Const MaxS = S_(n+1); // the recursive server chain ensures that server redundancies are maintained
While j <= K do
  S.Vm = V_(m+1); // recursive server virtual instances for internal server resources (I/O, RAM, CPU, etc.)
  FPGA acceleration = shortestPathJobOffload // initialization
  Set link = 10 Gbps // interconnection links // initialization
  If Var = Var + 1 then Sort with FSoC // Var allocates memory spaces on the CPU for read/write operations, provided they are not used up by the CPU
    Var P_0, Q_1, R_1, N_(k+1): Array [0..MaxN, 0..MaxN] of real nonzero terms;
    a, N, i, j: integer; // a = security term; N, i, j = control loop variables
  end if
end while
Begin // read operation
Readln(N); // read user jobs from the CPU
While i <= j do
  For i := 0 to N-1 do for j := 0 to N-1 do read(P[i]); // implements the read job/task request
  For i := 0 to N-1 do for j := 0 to N-1 do read(Q[i]);
  For i := 0 to N-1 do for j := 0 to N-1 do read(R[i]);
  For i := 0 to N-1 do for j := 0 to N-1 do read(N_(k+1)[i]);
  For i := 0 to N-1 do for j := 0 to N-1 do r[i] := P[i] + Q[i] + R[i] + ... + N_(k+1)[i];
  For i := 0 to N-1 do for j := 0 to N do
    If j = S_(n+1) then
      // get the job request threads with maximum throughput
      While j := i+1 to N do
        If a[j] ≠ a[MinSec] then S_(n+1) ≠ 1
          DataOffload >= 0 // server CPU
        else
          Return;
        end if
      end while
    end if
  Transferjob.shortestpath = NextPath // the recursive CPU server chain ensures that workloads are transferred efficiently over the shortest path
end while
end procedure
Fig. 8. HCCS-DCCN Read/write Algorithm
Fig. 9. Throughput Stability Response
Fig. 10. Cloud Server Utilization Response
[41] to run trace-file statistics. For the three scenarios, the number of mappers (32 MB per job), the data size (1024 MB), and the number of reducers (3) were kept the same in all cases (a toy sketch of this split is given after this paragraph). From Figure 9, it was observed that the proposed FSoC-DCCN had a relatively better throughput with an optimal virtual instance allocation coordinator. In this regard, the average throughput stability responses for DCCN, DCell, and BCube are 40.00%, 33.33%, and 26.67%, respectively.
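The sketch below illustrates only the split arithmetic assumed above (a 1024 MB input divided into 32 MB map tasks feeding 3 reducers); it is a toy stand-in, not the emulated cached MapReduce engine cited in [40].

// Toy split/partition arithmetic for the assumed MapReduce configuration.
#include <cstddef>
#include <cstdio>

int main() {
    const std::size_t input_mb = 1024, split_mb = 32, reducers = 3;
    const std::size_t mappers = input_mb / split_mb;          // 32 map tasks

    // Each mapper's output goes to a reducer chosen by a simple hash partition.
    std::size_t per_reducer[reducers] = {0, 0, 0};
    for (std::size_t m = 0; m < mappers; ++m)
        per_reducer[m % reducers] += split_mb;

    for (std::size_t r = 0; r < reducers; ++r)
        std::printf("reducer %zu handles ~%zu MB\n", r, per_reducer[r]);
    return 0;
}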
With Type-1 virtualization of the Spine-Leaf DCCN server cores alongside the FSoC acceleration, relative performance gains are feasible. Scientific workflows running in large, geographically distributed, and highly dynamic computing environments can efficiently use the FSoC-DCCN, because FSoC-based platforms can effectively satisfy the throughput stability requirement in a production deployment. In Figure 10, resource availability refers to the ability to access the FSoC-DCCN server clusters on demand while completing job requests. The complexity of the cloud datacenter architecture and its overall infrastructure makes resource utilization another important parameter. It was observed that the proposed FSoC-DCCN offered better resource utilization (for the workloads) compared with the BCube and DCell scenarios.
When all existing resources in the FSoC-DCCN server clusters are used up by means of over-allocation, additional resources can be reserved for high-priority jobs that arrive. In this context, when a job arrives, the availability of the VM is guaranteed; the issue is the availability of resources to execute the job. If the VM is available, then the job is allowed to run on the VM via dynamic allocation, considering the network density. This occurs only for Type-1 virtualization on the cloud DCN Spine-Leaf model. It was shown that the proposed scheme had about 58.06% resource utilization (i.e., when logically isolated with FPGA device cores), while the others offered 38.71% (BCube) and 3.23% (DCell), respectively (i.e., when not logically isolated with FPGA cores). The implication is that FPGA-based DCCNs will offload tasks from server processors more frequently than other accelerator options, since cloud service processing rates are high. It also implies that the proposed model will offer fairly good resource availability, leading to enhanced performance. This makes it more attractive in hyperscale datacenters for Warehouse Scaled Computers (WSCs). Hence, VM-based cloud networks, particularly the cell-based and Spine-Leaf WSCs, can benefit from this advantage.
From the plots in Figures 9 and 10, network infrastructure that processes bandwidth-intensive applications will scale optimally with the FSoC. This is because a key potential benefit of the integrated processor and FPGA system is the ability to boost system performance by accelerating compute-intensive functions in FPGA logic (i.e., hardware acceleration and cache coherency) while making more resources available. The processor performance is improved by the FSoC co-processing roles, ranging from computing cyclic redundancy checks (CRCs) to offloading the entire TCP/IP stack. When the FPGA-based accelerator produces a new result, the data needs to be passed back to the processor as quickly as possible, so that the processor can update its view of the data. As a validation, a network case of 1632 servers with FPGAs running an enterprise web search service was analyzed in Figure 11, which shows improved throughput with FPGA acceleration compared with the case without FPGA acceleration, in terms of query latency responses.
Another key benefit of integrating a General Purpose Processor (GPP) and an FPGA on a single piece of real estate is the ability to accelerate system performance by offloading critical functions to the FPGA. Transferring the data quickly and coherently is key to realizing the performance boost in cloud-based networks. Datacenter network optimization with FPGA acceleration improves bandwidth efficiency while satisfying QoS metrics. Network equipment embedded with FPGA processors would eliminate various performance bottlenecks that software-driven processors cannot overcome. Smart computing and intelligence applications with massive workloads will benefit from this alternative.
5. Conclusion
This paper has presented a super-scalar cloud datacenter network built with FPGA core support. It offers excellent throughput, low latency, and good resource utilization when compared with the DCell and BCube datacenter networks. Hence, offloading key functions from the processor to the FPGA can result in substantial improvements in system performance while reducing system power drain. As observed in existing Warehouse Scaled Computers (WSCs), high-speed, low-latency interconnects between the processors and the FSoC are
Fig. 11. FPGA query latency behavior [38]
necessary for optimal performance. The proposed datacenter network offers memory coherency through the use of FPGA acceleration coherency. With this, issues of bandwidth, performance, integration, and power requirements are addressed. In highly dynamic environments, the performance of various types of computing workloads (such as databases, big data analytics, and high-performance computing) can be improved using the proposed FPGA acceleration in the Spine-Leaf datacenter model. As more and more workloads are deployed in the cloud, it is appropriate to consider how to make FPGAs and their capabilities available in the cloud. Hence, the proposed system offers a low-latency path from the network interface to the consuming process, irrespective of network workloads. As a proof of concept and validation, a micro testbed setup on a real-life datacenter was explored. The work used DCCN to model a datacenter Spine-Leaf architecture running traffic patterns sampled from the Riverbed application engine on top of Linux KVM and the Virtex UltraScale FPGA target device. This enables isolation between multiple processes in multiple VMs, along with accurate acceleration, resource allocation, and priority-based workload scheduling for QoS. The results from the FPGA DCCN offloading strategy in Spine-Leaf designs show that Type-1 virtualization influences resource allocation and scheduling. With FPGA acceleration, the performance of cloud computing systems, particularly in QoS contexts, is enhanced. Consequently, newer processors can use FPGAs to accelerate applications (workload optimization). Furthermore, with WSCs (FPGA-based servers), the Central Processing Unit (CPU) of Spine-Leaf topologies can easily offload tasks to FPGA device architectures for hardware acceleration. The conclusion is that the global deployment of FPGA-based cloud datacenters will enable large-scale scientific workflows to improve performance and deliver fast responses regarding QoS. Future work will focus on mathematical modeling and state analysis of Markovian queues with working vacations on heterogeneous FPGA cloud-based servers. The work will also investigate power drain in high-density networks, the chip area, and comparisons with GPUs and other accelerators.
ACKNOWLEDGMENTS. We wish to specially thank the Cloud Computing and Distributed Systems (CLOUDS) Laboratory at the University of Melbourne, Australia; the Department of Electronic Engineering, UNN; the Center for Basic Space Science, UNN; the Energy Commission of Nigeria, NCERD-UNN; and the National Agency for Science and Engineering Infrastructure (NASENI) for their immense support in the course of this research work.
References
1. Microsoft (2016) Azure successful stories, online :
http://www.windowsazure.com/en-us/case-studies/archive/.
2. Lu G et al. (2011) Serverswitch: A programmable and high
performance platform for data center networks in In NSDI’11
Proc. of 8th USENIX conference on Networked systems design
and implementation.
3. Okafor K, Ugwoke F, Obayi AA, Chijindu V, Oparaku O
(2016) Analysis of cloud network management using resource
allocation and task scheduling services. International Journal
of Advanced Computer Science & Applications 1(7):375–386.
4. Guo C et al. (2008) Dcell: a scalable and fault-tolerant net-
work structure for data centers. ACM SIGCOMM Computer
Communication Review 38(4):75–86.
5. Al-Fares M, Loukissas A, Vahdat A (2008) A scalable, com-
modity data center network architecture. SIGCOMM Comput.
Commun. Rev. 38(4):63–74.
6. Guo C et al. (2009) Bcube: a high performance, server-centric
network architecture for modular data centers. ACM SIG-
COMM Computer Communication Review 39(4):63–74.
7. Greenberg A et al. (2009) Vl2: a scalable and flexible data
center network. ACM SIGCOMM computer communication
review 39(4):51–62.
8. Niranjan Mysore R et al. (2009) Portland: A scalable fault-
tolerant layer 2 data center network fabric. SIGCOMM Com-
put. Commun. Rev. 39(4):39–50.
9. D KC (2016) Ph.D. thesis (University of Nigeria Nsukka).
10. Okafor K, Nwaodo T (2012) A synthesis vlan approach to
congestion management in datacenter internet networks. Inter-
national Journal of Electronics and Telecommunication System
Research 5(6):86–92.
11. Alizadeh M et al. (2008) Data center transport mechanisms:
Congestion control theory and ieee standardization in Commu-
nication, Control, and Computing, 2008 46th Annual Allerton
Conference on. (IEEE), pp. 1270–1277.
12. Al-Fares M, Radhakrishnan S, Raghavan B, Huang N, Vahdat
A (2010) Hedera: Dynamic flow scheduling for data center
networks. in NSDI. Vol. 10, pp. 19–19.
13. Wood T (2011) Ph.D. thesis (University of Massachusetts
Amherst).
14. Guo C et al. (2010) Secondnet: a data center network virtual-
ization architecture with bandwidth guarantees in Proceedings
of the 6th International COnference. (ACM), p. 15.
15. Shieh A, Kandula S, Sirer EG (2010) Sidecar: building
programmable datacenter networks without programmable
switches in Proceedings of the 9th ACM SIGCOMM Workshop
on Hot Topics in Networks. (ACM), p. 21.
16. Abu-Libdeh H, Costa P, Rowstron A, O’Shea G, Donnelly
A (2010) Symbiotic routing in future data centers. ACM
SIGCOMM Computer Communication Review 40(4):51–62.
17. Arista (2015) Arista universal cloud network white paper
(https://www.arista.com).
18. Cisco (2013) Cisco fabric path technology and de-
sign brkdct-2081 (http://www.valleytalk.org/wp-
content/uploads/2013/08/BRKDCT-2081-Cisco-FabricPath-
Technology-and-Design.pdf).
19. Xilinx (2016) Field programmable gate array (fpga)
(https://www.xilinx.com/training/fpga/fpga-field-
programmable-gate-array.htm).
20. Goldberg RP (1973) Architecture of virtual machines in Pro-
ceedings of the workshop on virtual computer systems. (ACM),
pp. 74–112.
21. Kohler E, Morris R, Chen B, Jannotti J, Kaashoek MF (2000)
The click modular router. ACM Transactions on Computer
Systems (TOCS) 18(3):263–297.
22. Dobrescu M et al. (2009) Routebricks: exploiting parallelism
to scale software routers in Proceedings of the ACM SIGOPS
22nd symposium on Operating systems principles. (ACM), pp.
15–28.
23. Naous J, Gibb G, Bolouki S, McKeown N (2008) Netfpga:
reusable router architecture for experimental research in Pro-
ceedings of the ACM workshop on Programmable routers for
extensible services of tomorrow. (ACM), pp. 1–7.
24. Yang R, Wang J, Clement B, Mansour A (2013) Fpga imple-
mentation of a parameterized fourier synthesizer in Electronics,
Circuits, and Systems (ICECS), 2013 IEEE 20th International
Conference on. (IEEE), pp. 473–476.
25. Kliegl M et al. (2010) Generalized dcell structure for load-
balanced data center networks in INFOCOM IEEE Conference
on Computer Communications Workshops, 2010. (IEEE), pp.
1–5.
26. Overholt M, Wang S (2016) Modularized data center
cube (http://pbg.cs.illinois.edu/courses/cs538fa11/lectures/17-
Mark-Shiguang.pdf).
27. Udeze C, Okafor K, Okezie C, Okeke I, Ezekwe C (2014) Per-
formance analysis of r-dcn architecture for next generation web
application integration in 2014 IEEE 6th International Con-
ference on Adaptive Science & Technology (ICAST). (IEEE),
pp. 1–12.
28. Farrington N et al. (2010) Helios: a hybrid electrical/optical
switch architecture for modular data centers. ACM SIGCOMM
Computer Communication Review 40(4):339–350.
29. Wang G et al. (2010) c-through: part-time optics in data
centers. SIGCOMM Comput. Commun. Rev. 41(4):–.
30. Tan Z (2013) Ph.D. thesis (Department of Electrical Engineer-
ing and Computer Sciences, University Of California, Berke-
ley).
31. Cisco (2012) Cisco’s massively scalable data center network
fabric for warehouse scale computer., (Cisco), Technical report.
32. Alcatel-Lucent (2013) Data center converged solutions design
guide, (Alcatel-Lucent), Technical report.
33. Tan Z, Qian Z, Asanovic XCK, Patterson D (2013) Diablo:
Simulating datacenter network at scale using fpgas, (ASPIRE
UC Berkeley), Technical report.
34. Putnam A et al. (2015) A reconfigurable fabric for accelerating
large-scale datacenter services. IEEE Micro 35(3):10–22.
35. Joost R, Salomon R (2005) Advantages of fpga-based multipro-
cessor systems in industrial applications in 31st Annual Con-
ference of IEEE Industrial Electronics Society, 2005. IECON
2005. (IEEE), pp. 6–pp.
36. Savaš E, Tenca AF, Koç CK (2000) A scalable and unified
multiplier architecture for finite fields gf (p) and gf (2m) in
International Workshop on Cryptographic Hardware and Em-
bedded Systems. (Springer), pp. 277–292.
37. K.C.Okafor, Ezeha G, I.E. Achumba FU, Okezie C, U.H.Diala
(2015) Harnessing fpga processor cores in evolving cloud based
datacenter network designs (dccn) in Proc. 12th International Conference of the Nigeria Computer Society - Information Technology for Inclusive Development. (Nigerian Computer Society), pp. 1–14.
38. Morgan TP (2014) How microsoft is using fpgas to speed up bing search (http://www.enterprisetech.com/2014/09/03/microsoft-using-fpgas-speed-bing-search/).
39. Dean J, Ghemawat S (2008) Mapreduce: simplified data
processing on large clusters. Communications of the ACM
51(1):107–113.
40. Chauhan A, Fontama V, Hart M, Tok WH, Buck W (2014)
Introducing Microsoft Azure HDInsight-Technical Overview.
(Microsoft Press).
41. Simmhan Y, Van Ingen C, Subramanian G, Li J (2010) Bridging
the gap between desktop and the cloud for escience applica-
tions in 2010 IEEE 3rd International Conference on Cloud
Computing. (IEEE), pp. 474–481.
Efficient Design of p-Cycles for Survivability of WDM Networks Through Distri...
 
Ship Ad-hoc Network (SANET)
Ship Ad-hoc Network (SANET)	Ship Ad-hoc Network (SANET)
Ship Ad-hoc Network (SANET)
 
A Scalable, Commodity Data Center Network Architecture
A Scalable, Commodity Data Center Network ArchitectureA Scalable, Commodity Data Center Network Architecture
A Scalable, Commodity Data Center Network Architecture
 
Towards a low cost etl system
Towards a low cost etl systemTowards a low cost etl system
Towards a low cost etl system
 
Run-Time Adaptive Processor Allocation of Self-Configurable Intel IXP2400 Net...
Run-Time Adaptive Processor Allocation of Self-Configurable Intel IXP2400 Net...Run-Time Adaptive Processor Allocation of Self-Configurable Intel IXP2400 Net...
Run-Time Adaptive Processor Allocation of Self-Configurable Intel IXP2400 Net...
 

Similar to 5 1-33-1-10-20161221 kennedy

CLOUD RAN- Benefits of Centralization and Virtualization
CLOUD RAN- Benefits of Centralization and VirtualizationCLOUD RAN- Benefits of Centralization and Virtualization
CLOUD RAN- Benefits of Centralization and Virtualization
Aricent
 
Towards achieving-high-performance-in-5g-mobile-packet-cores-user-plane-function
Towards achieving-high-performance-in-5g-mobile-packet-cores-user-plane-functionTowards achieving-high-performance-in-5g-mobile-packet-cores-user-plane-function
Towards achieving-high-performance-in-5g-mobile-packet-cores-user-plane-function
Eiko Seidel
 
Low power network on chip architectures: A survey
Low power network on chip architectures: A surveyLow power network on chip architectures: A survey
Low power network on chip architectures: A survey
CSITiaesprime
 
CONTAINERIZED SERVICES ORCHESTRATION FOR EDGE COMPUTING IN SOFTWARE-DEFINED W...
CONTAINERIZED SERVICES ORCHESTRATION FOR EDGE COMPUTING IN SOFTWARE-DEFINED W...CONTAINERIZED SERVICES ORCHESTRATION FOR EDGE COMPUTING IN SOFTWARE-DEFINED W...
CONTAINERIZED SERVICES ORCHESTRATION FOR EDGE COMPUTING IN SOFTWARE-DEFINED W...
IJCNCJournal
 
5G Edge Computing Whitepaper, FCC Advisory Council
5G Edge Computing Whitepaper, FCC Advisory Council5G Edge Computing Whitepaper, FCC Advisory Council
5G Edge Computing Whitepaper, FCC Advisory Council
DESMOND YUEN
 
ENHANCING AND MEASURING THE PERFORMANCE IN SOFTWARE DEFINED NETWORKING
ENHANCING AND MEASURING THE PERFORMANCE IN SOFTWARE DEFINED NETWORKINGENHANCING AND MEASURING THE PERFORMANCE IN SOFTWARE DEFINED NETWORKING
ENHANCING AND MEASURING THE PERFORMANCE IN SOFTWARE DEFINED NETWORKING
IJCNCJournal
 
A LIGHT WEIGHT VLSI FRAME WORK FOR HIGHT CIPHER ON FPGA
A LIGHT WEIGHT VLSI FRAME WORK FOR HIGHT CIPHER ON FPGAA LIGHT WEIGHT VLSI FRAME WORK FOR HIGHT CIPHER ON FPGA
A LIGHT WEIGHT VLSI FRAME WORK FOR HIGHT CIPHER ON FPGA
IRJET Journal
 
Hardware virtualized flexible network for wireless data center optical interc...
Hardware virtualized flexible network for wireless data center optical interc...Hardware virtualized flexible network for wireless data center optical interc...
Hardware virtualized flexible network for wireless data center optical interc...
ieeepondy
 
Network on Chip Architecture and Routing Techniques: A survey
Network on Chip Architecture and Routing Techniques: A surveyNetwork on Chip Architecture and Routing Techniques: A survey
Network on Chip Architecture and Routing Techniques: A survey
IJRES Journal
 
20607-39024-1-PB.pdf
20607-39024-1-PB.pdf20607-39024-1-PB.pdf
20607-39024-1-PB.pdf
IjictTeam
 
A 01
A 01A 01
A 01
kakaken9x
 
WIRLESS CLOUD NETWORK
WIRLESS CLOUD NETWORKWIRLESS CLOUD NETWORK
WIRLESS CLOUD NETWORKAashish Pande
 
A Flexible Software/Hardware Adaptive Network for Embedded Distributed Archit...
A Flexible Software/Hardware Adaptive Network for Embedded Distributed Archit...A Flexible Software/Hardware Adaptive Network for Embedded Distributed Archit...
A Flexible Software/Hardware Adaptive Network for Embedded Distributed Archit...
csijjournal
 
FPGA IMPLEMENTATION OF APPROXIMATE SOFTMAX FUNCTION FOR EFFICIENT CNN INFERENCE
FPGA IMPLEMENTATION OF APPROXIMATE SOFTMAX FUNCTION FOR EFFICIENT CNN INFERENCEFPGA IMPLEMENTATION OF APPROXIMATE SOFTMAX FUNCTION FOR EFFICIENT CNN INFERENCE
FPGA IMPLEMENTATION OF APPROXIMATE SOFTMAX FUNCTION FOR EFFICIENT CNN INFERENCE
International Research Journal of Modernization in Engineering Technology and Science
 
Network Virtualization using Shortest Path Bridging
Network Virtualization using Shortest Path Bridging Network Virtualization using Shortest Path Bridging
Network Virtualization using Shortest Path Bridging
Motty Ben Atia
 
Service oriented cloud architecture for improved performance of smart grid ap...
Service oriented cloud architecture for improved performance of smart grid ap...Service oriented cloud architecture for improved performance of smart grid ap...
Service oriented cloud architecture for improved performance of smart grid ap...
eSAT Journals
 
Service oriented cloud architecture for improved
Service oriented cloud architecture for improvedService oriented cloud architecture for improved
Service oriented cloud architecture for improved
eSAT Publishing House
 
Designing network topology.pptx
Designing network topology.pptxDesigning network topology.pptx
Designing network topology.pptx
KISHOYIANKISH
 
Design and Performance Analysis of 8 x 8 Network on Chip Router
Design and Performance Analysis of 8 x 8 Network on Chip RouterDesign and Performance Analysis of 8 x 8 Network on Chip Router
Design and Performance Analysis of 8 x 8 Network on Chip Router
IRJET Journal
 
ANALYSIS OF LINK STATE RESOURCE RESERVATION PROTOCOL FOR CONGESTION MANAGEMEN...
ANALYSIS OF LINK STATE RESOURCE RESERVATION PROTOCOL FOR CONGESTION MANAGEMEN...ANALYSIS OF LINK STATE RESOURCE RESERVATION PROTOCOL FOR CONGESTION MANAGEMEN...
ANALYSIS OF LINK STATE RESOURCE RESERVATION PROTOCOL FOR CONGESTION MANAGEMEN...
ijgca
 

Similar to 5 1-33-1-10-20161221 kennedy (20)

CLOUD RAN- Benefits of Centralization and Virtualization
CLOUD RAN- Benefits of Centralization and VirtualizationCLOUD RAN- Benefits of Centralization and Virtualization
CLOUD RAN- Benefits of Centralization and Virtualization
 
Towards achieving-high-performance-in-5g-mobile-packet-cores-user-plane-function
Towards achieving-high-performance-in-5g-mobile-packet-cores-user-plane-functionTowards achieving-high-performance-in-5g-mobile-packet-cores-user-plane-function
Towards achieving-high-performance-in-5g-mobile-packet-cores-user-plane-function
 
Low power network on chip architectures: A survey
Low power network on chip architectures: A surveyLow power network on chip architectures: A survey
Low power network on chip architectures: A survey
 
CONTAINERIZED SERVICES ORCHESTRATION FOR EDGE COMPUTING IN SOFTWARE-DEFINED W...
CONTAINERIZED SERVICES ORCHESTRATION FOR EDGE COMPUTING IN SOFTWARE-DEFINED W...CONTAINERIZED SERVICES ORCHESTRATION FOR EDGE COMPUTING IN SOFTWARE-DEFINED W...
CONTAINERIZED SERVICES ORCHESTRATION FOR EDGE COMPUTING IN SOFTWARE-DEFINED W...
 
5G Edge Computing Whitepaper, FCC Advisory Council
5G Edge Computing Whitepaper, FCC Advisory Council5G Edge Computing Whitepaper, FCC Advisory Council
5G Edge Computing Whitepaper, FCC Advisory Council
 
ENHANCING AND MEASURING THE PERFORMANCE IN SOFTWARE DEFINED NETWORKING
ENHANCING AND MEASURING THE PERFORMANCE IN SOFTWARE DEFINED NETWORKINGENHANCING AND MEASURING THE PERFORMANCE IN SOFTWARE DEFINED NETWORKING
ENHANCING AND MEASURING THE PERFORMANCE IN SOFTWARE DEFINED NETWORKING
 
A LIGHT WEIGHT VLSI FRAME WORK FOR HIGHT CIPHER ON FPGA
A LIGHT WEIGHT VLSI FRAME WORK FOR HIGHT CIPHER ON FPGAA LIGHT WEIGHT VLSI FRAME WORK FOR HIGHT CIPHER ON FPGA
A LIGHT WEIGHT VLSI FRAME WORK FOR HIGHT CIPHER ON FPGA
 
Hardware virtualized flexible network for wireless data center optical interc...
Hardware virtualized flexible network for wireless data center optical interc...Hardware virtualized flexible network for wireless data center optical interc...
Hardware virtualized flexible network for wireless data center optical interc...
 
Network on Chip Architecture and Routing Techniques: A survey
Network on Chip Architecture and Routing Techniques: A surveyNetwork on Chip Architecture and Routing Techniques: A survey
Network on Chip Architecture and Routing Techniques: A survey
 
20607-39024-1-PB.pdf
20607-39024-1-PB.pdf20607-39024-1-PB.pdf
20607-39024-1-PB.pdf
 
A 01
A 01A 01
A 01
 
WIRLESS CLOUD NETWORK
WIRLESS CLOUD NETWORKWIRLESS CLOUD NETWORK
WIRLESS CLOUD NETWORK
 
A Flexible Software/Hardware Adaptive Network for Embedded Distributed Archit...
A Flexible Software/Hardware Adaptive Network for Embedded Distributed Archit...A Flexible Software/Hardware Adaptive Network for Embedded Distributed Archit...
A Flexible Software/Hardware Adaptive Network for Embedded Distributed Archit...
 
FPGA IMPLEMENTATION OF APPROXIMATE SOFTMAX FUNCTION FOR EFFICIENT CNN INFERENCE
FPGA IMPLEMENTATION OF APPROXIMATE SOFTMAX FUNCTION FOR EFFICIENT CNN INFERENCEFPGA IMPLEMENTATION OF APPROXIMATE SOFTMAX FUNCTION FOR EFFICIENT CNN INFERENCE
FPGA IMPLEMENTATION OF APPROXIMATE SOFTMAX FUNCTION FOR EFFICIENT CNN INFERENCE
 
Network Virtualization using Shortest Path Bridging
Network Virtualization using Shortest Path Bridging Network Virtualization using Shortest Path Bridging
Network Virtualization using Shortest Path Bridging
 
Service oriented cloud architecture for improved performance of smart grid ap...
Service oriented cloud architecture for improved performance of smart grid ap...Service oriented cloud architecture for improved performance of smart grid ap...
Service oriented cloud architecture for improved performance of smart grid ap...
 
Service oriented cloud architecture for improved
Service oriented cloud architecture for improvedService oriented cloud architecture for improved
Service oriented cloud architecture for improved
 
Designing network topology.pptx
Designing network topology.pptxDesigning network topology.pptx
Designing network topology.pptx
 
Design and Performance Analysis of 8 x 8 Network on Chip Router
Design and Performance Analysis of 8 x 8 Network on Chip RouterDesign and Performance Analysis of 8 x 8 Network on Chip Router
Design and Performance Analysis of 8 x 8 Network on Chip Router
 
ANALYSIS OF LINK STATE RESOURCE RESERVATION PROTOCOL FOR CONGESTION MANAGEMEN...
ANALYSIS OF LINK STATE RESOURCE RESERVATION PROTOCOL FOR CONGESTION MANAGEMEN...ANALYSIS OF LINK STATE RESOURCE RESERVATION PROTOCOL FOR CONGESTION MANAGEMEN...
ANALYSIS OF LINK STATE RESOURCE RESERVATION PROTOCOL FOR CONGESTION MANAGEMEN...
 

More from Onyebuchi nosiri

Comparative power flow analysis of 28 and 52 buses for 330 kv power grid netw...
Comparative power flow analysis of 28 and 52 buses for 330 kv power grid netw...Comparative power flow analysis of 28 and 52 buses for 330 kv power grid netw...
Comparative power flow analysis of 28 and 52 buses for 330 kv power grid netw...
Onyebuchi nosiri
 
Comparative power flow analysis of 28 and 52 buses for 330 kv power grid netw...
Comparative power flow analysis of 28 and 52 buses for 330 kv power grid netw...Comparative power flow analysis of 28 and 52 buses for 330 kv power grid netw...
Comparative power flow analysis of 28 and 52 buses for 330 kv power grid netw...
Onyebuchi nosiri
 
Implementation of Particle Swarm Optimization Technique for Enhanced Outdoor ...
Implementation of Particle Swarm Optimization Technique for Enhanced Outdoor ...Implementation of Particle Swarm Optimization Technique for Enhanced Outdoor ...
Implementation of Particle Swarm Optimization Technique for Enhanced Outdoor ...
Onyebuchi nosiri
 
Telecom infrastructure-sharing-a-panacea-for-sustainability-cost-and-network-...
Telecom infrastructure-sharing-a-panacea-for-sustainability-cost-and-network-...Telecom infrastructure-sharing-a-panacea-for-sustainability-cost-and-network-...
Telecom infrastructure-sharing-a-panacea-for-sustainability-cost-and-network-...
Onyebuchi nosiri
 
VOLTAGE STABILITY IN NIGERIA 330KV INTEGRATED 52 BUS POWER NETWORK USING PATT...
VOLTAGE STABILITY IN NIGERIA 330KV INTEGRATED 52 BUS POWER NETWORK USING PATT...VOLTAGE STABILITY IN NIGERIA 330KV INTEGRATED 52 BUS POWER NETWORK USING PATT...
VOLTAGE STABILITY IN NIGERIA 330KV INTEGRATED 52 BUS POWER NETWORK USING PATT...
Onyebuchi nosiri
 
Voltage Stability Investigation of the Nigeria 330KV Interconnected Grid Syst...
Voltage Stability Investigation of the Nigeria 330KV Interconnected Grid Syst...Voltage Stability Investigation of the Nigeria 330KV Interconnected Grid Syst...
Voltage Stability Investigation of the Nigeria 330KV Interconnected Grid Syst...
Onyebuchi nosiri
 
VOLTAGE STABILITY IN NIGERIA 330KV INTEGRATED 52 BUS POWER NETWORK USING PATT...
VOLTAGE STABILITY IN NIGERIA 330KV INTEGRATED 52 BUS POWER NETWORK USING PATT...VOLTAGE STABILITY IN NIGERIA 330KV INTEGRATED 52 BUS POWER NETWORK USING PATT...
VOLTAGE STABILITY IN NIGERIA 330KV INTEGRATED 52 BUS POWER NETWORK USING PATT...
Onyebuchi nosiri
 
Quadcopter Design for Payload Delivery
Quadcopter Design for Payload Delivery Quadcopter Design for Payload Delivery
Quadcopter Design for Payload Delivery
Onyebuchi nosiri
 
Investigation of TV White Space for Maximum Spectrum Utilization in a Cellula...
Investigation of TV White Space for Maximum Spectrum Utilization in a Cellula...Investigation of TV White Space for Maximum Spectrum Utilization in a Cellula...
Investigation of TV White Space for Maximum Spectrum Utilization in a Cellula...
Onyebuchi nosiri
 
Path Loss Characterization of 3G Wireless Signal for Urban and Suburban Envir...
Path Loss Characterization of 3G Wireless Signal for Urban and Suburban Envir...Path Loss Characterization of 3G Wireless Signal for Urban and Suburban Envir...
Path Loss Characterization of 3G Wireless Signal for Urban and Suburban Envir...
Onyebuchi nosiri
 
Signal Strength Evaluation of a 3G Network in Owerri Metropolis Using Path Lo...
Signal Strength Evaluation of a 3G Network in Owerri Metropolis Using Path Lo...Signal Strength Evaluation of a 3G Network in Owerri Metropolis Using Path Lo...
Signal Strength Evaluation of a 3G Network in Owerri Metropolis Using Path Lo...
Onyebuchi nosiri
 
Investigation of TV White Space for Maximum Spectrum Utilization in a Cellula...
Investigation of TV White Space for Maximum Spectrum Utilization in a Cellula...Investigation of TV White Space for Maximum Spectrum Utilization in a Cellula...
Investigation of TV White Space for Maximum Spectrum Utilization in a Cellula...
Onyebuchi nosiri
 
Evaluation of Percentage Capacity Loss on LTE Network Caused by Intermodulati...
Evaluation of Percentage Capacity Loss on LTE Network Caused by Intermodulati...Evaluation of Percentage Capacity Loss on LTE Network Caused by Intermodulati...
Evaluation of Percentage Capacity Loss on LTE Network Caused by Intermodulati...
Onyebuchi nosiri
 
Modelling, Simulation and Analysis of a Low-Noise Block Converter (LNBC) Used...
Modelling, Simulation and Analysis of a Low-Noise Block Converter (LNBC) Used...Modelling, Simulation and Analysis of a Low-Noise Block Converter (LNBC) Used...
Modelling, Simulation and Analysis of a Low-Noise Block Converter (LNBC) Used...
Onyebuchi nosiri
 
Design and Implementation of a Simple HMC6352 2-Axis-MR Digital Compass
Design and Implementation of a Simple HMC6352 2-Axis-MR Digital Compass Design and Implementation of a Simple HMC6352 2-Axis-MR Digital Compass
Design and Implementation of a Simple HMC6352 2-Axis-MR Digital Compass
Onyebuchi nosiri
 
An Embedded Voice Activated Automobile Speed Limiter: A Design Approach for C...
An Embedded Voice Activated Automobile Speed Limiter: A Design Approach for C...An Embedded Voice Activated Automobile Speed Limiter: A Design Approach for C...
An Embedded Voice Activated Automobile Speed Limiter: A Design Approach for C...
Onyebuchi nosiri
 
OPTIMIZATION OF COST 231 MODEL FOR 3G WIRELESS COMMUNICATION SIGNAL IN SUBURB...
OPTIMIZATION OF COST 231 MODEL FOR 3G WIRELESS COMMUNICATION SIGNAL IN SUBURB...OPTIMIZATION OF COST 231 MODEL FOR 3G WIRELESS COMMUNICATION SIGNAL IN SUBURB...
OPTIMIZATION OF COST 231 MODEL FOR 3G WIRELESS COMMUNICATION SIGNAL IN SUBURB...
Onyebuchi nosiri
 
Coverage and Capacity Performance Degradation on a Co-Located Network Involvi...
Coverage and Capacity Performance Degradation on a Co-Located Network Involvi...Coverage and Capacity Performance Degradation on a Co-Located Network Involvi...
Coverage and Capacity Performance Degradation on a Co-Located Network Involvi...
Onyebuchi nosiri
 
Comparative Study of Path Loss Models for Wireless Communication in Urban and...
Comparative Study of Path Loss Models for Wireless Communication in Urban and...Comparative Study of Path Loss Models for Wireless Communication in Urban and...
Comparative Study of Path Loss Models for Wireless Communication in Urban and...
Onyebuchi nosiri
 
 ...
 ... ...
 ...
Onyebuchi nosiri
 

More from Onyebuchi nosiri (20)

Comparative power flow analysis of 28 and 52 buses for 330 kv power grid netw...
Comparative power flow analysis of 28 and 52 buses for 330 kv power grid netw...Comparative power flow analysis of 28 and 52 buses for 330 kv power grid netw...
Comparative power flow analysis of 28 and 52 buses for 330 kv power grid netw...
 
Comparative power flow analysis of 28 and 52 buses for 330 kv power grid netw...
Comparative power flow analysis of 28 and 52 buses for 330 kv power grid netw...Comparative power flow analysis of 28 and 52 buses for 330 kv power grid netw...
Comparative power flow analysis of 28 and 52 buses for 330 kv power grid netw...
 
Implementation of Particle Swarm Optimization Technique for Enhanced Outdoor ...
Implementation of Particle Swarm Optimization Technique for Enhanced Outdoor ...Implementation of Particle Swarm Optimization Technique for Enhanced Outdoor ...
Implementation of Particle Swarm Optimization Technique for Enhanced Outdoor ...
 
Telecom infrastructure-sharing-a-panacea-for-sustainability-cost-and-network-...
Telecom infrastructure-sharing-a-panacea-for-sustainability-cost-and-network-...Telecom infrastructure-sharing-a-panacea-for-sustainability-cost-and-network-...
Telecom infrastructure-sharing-a-panacea-for-sustainability-cost-and-network-...
 
VOLTAGE STABILITY IN NIGERIA 330KV INTEGRATED 52 BUS POWER NETWORK USING PATT...
VOLTAGE STABILITY IN NIGERIA 330KV INTEGRATED 52 BUS POWER NETWORK USING PATT...VOLTAGE STABILITY IN NIGERIA 330KV INTEGRATED 52 BUS POWER NETWORK USING PATT...
VOLTAGE STABILITY IN NIGERIA 330KV INTEGRATED 52 BUS POWER NETWORK USING PATT...
 
Voltage Stability Investigation of the Nigeria 330KV Interconnected Grid Syst...
Voltage Stability Investigation of the Nigeria 330KV Interconnected Grid Syst...Voltage Stability Investigation of the Nigeria 330KV Interconnected Grid Syst...
Voltage Stability Investigation of the Nigeria 330KV Interconnected Grid Syst...
 
VOLTAGE STABILITY IN NIGERIA 330KV INTEGRATED 52 BUS POWER NETWORK USING PATT...
VOLTAGE STABILITY IN NIGERIA 330KV INTEGRATED 52 BUS POWER NETWORK USING PATT...VOLTAGE STABILITY IN NIGERIA 330KV INTEGRATED 52 BUS POWER NETWORK USING PATT...
VOLTAGE STABILITY IN NIGERIA 330KV INTEGRATED 52 BUS POWER NETWORK USING PATT...
 
Quadcopter Design for Payload Delivery
Quadcopter Design for Payload Delivery Quadcopter Design for Payload Delivery
Quadcopter Design for Payload Delivery
 
Investigation of TV White Space for Maximum Spectrum Utilization in a Cellula...
Investigation of TV White Space for Maximum Spectrum Utilization in a Cellula...Investigation of TV White Space for Maximum Spectrum Utilization in a Cellula...
Investigation of TV White Space for Maximum Spectrum Utilization in a Cellula...
 
Path Loss Characterization of 3G Wireless Signal for Urban and Suburban Envir...
Path Loss Characterization of 3G Wireless Signal for Urban and Suburban Envir...Path Loss Characterization of 3G Wireless Signal for Urban and Suburban Envir...
Path Loss Characterization of 3G Wireless Signal for Urban and Suburban Envir...
 
Signal Strength Evaluation of a 3G Network in Owerri Metropolis Using Path Lo...
Signal Strength Evaluation of a 3G Network in Owerri Metropolis Using Path Lo...Signal Strength Evaluation of a 3G Network in Owerri Metropolis Using Path Lo...
Signal Strength Evaluation of a 3G Network in Owerri Metropolis Using Path Lo...
 
Investigation of TV White Space for Maximum Spectrum Utilization in a Cellula...
Investigation of TV White Space for Maximum Spectrum Utilization in a Cellula...Investigation of TV White Space for Maximum Spectrum Utilization in a Cellula...
Investigation of TV White Space for Maximum Spectrum Utilization in a Cellula...
 
Evaluation of Percentage Capacity Loss on LTE Network Caused by Intermodulati...
Evaluation of Percentage Capacity Loss on LTE Network Caused by Intermodulati...Evaluation of Percentage Capacity Loss on LTE Network Caused by Intermodulati...
Evaluation of Percentage Capacity Loss on LTE Network Caused by Intermodulati...
 
Modelling, Simulation and Analysis of a Low-Noise Block Converter (LNBC) Used...
Modelling, Simulation and Analysis of a Low-Noise Block Converter (LNBC) Used...Modelling, Simulation and Analysis of a Low-Noise Block Converter (LNBC) Used...
Modelling, Simulation and Analysis of a Low-Noise Block Converter (LNBC) Used...
 
Design and Implementation of a Simple HMC6352 2-Axis-MR Digital Compass
Design and Implementation of a Simple HMC6352 2-Axis-MR Digital Compass Design and Implementation of a Simple HMC6352 2-Axis-MR Digital Compass
Design and Implementation of a Simple HMC6352 2-Axis-MR Digital Compass
 
An Embedded Voice Activated Automobile Speed Limiter: A Design Approach for C...
An Embedded Voice Activated Automobile Speed Limiter: A Design Approach for C...An Embedded Voice Activated Automobile Speed Limiter: A Design Approach for C...
An Embedded Voice Activated Automobile Speed Limiter: A Design Approach for C...
 
OPTIMIZATION OF COST 231 MODEL FOR 3G WIRELESS COMMUNICATION SIGNAL IN SUBURB...
OPTIMIZATION OF COST 231 MODEL FOR 3G WIRELESS COMMUNICATION SIGNAL IN SUBURB...OPTIMIZATION OF COST 231 MODEL FOR 3G WIRELESS COMMUNICATION SIGNAL IN SUBURB...
OPTIMIZATION OF COST 231 MODEL FOR 3G WIRELESS COMMUNICATION SIGNAL IN SUBURB...
 
Coverage and Capacity Performance Degradation on a Co-Located Network Involvi...
Coverage and Capacity Performance Degradation on a Co-Located Network Involvi...Coverage and Capacity Performance Degradation on a Co-Located Network Involvi...
Coverage and Capacity Performance Degradation on a Co-Located Network Involvi...
 
Comparative Study of Path Loss Models for Wireless Communication in Urban and...
Comparative Study of Path Loss Models for Wireless Communication in Urban and...Comparative Study of Path Loss Models for Wireless Communication in Urban and...
Comparative Study of Path Loss Models for Wireless Communication in Urban and...
 
 ...
 ... ...
 ...
 

Recently uploaded

AKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdf
AKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdfAKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdf
AKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdf
SamSarthak3
 
在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样
在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样
在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样
obonagu
 
ethical hacking in wireless-hacking1.ppt
ethical hacking in wireless-hacking1.pptethical hacking in wireless-hacking1.ppt
ethical hacking in wireless-hacking1.ppt
Jayaprasanna4
 
The role of big data in decision making.
The role of big data in decision making.The role of big data in decision making.
The role of big data in decision making.
ankuprajapati0525
 
Immunizing Image Classifiers Against Localized Adversary Attacks
Immunizing Image Classifiers Against Localized Adversary AttacksImmunizing Image Classifiers Against Localized Adversary Attacks
Immunizing Image Classifiers Against Localized Adversary Attacks
gerogepatton
 
The Benefits and Techniques of Trenchless Pipe Repair.pdf
The Benefits and Techniques of Trenchless Pipe Repair.pdfThe Benefits and Techniques of Trenchless Pipe Repair.pdf
The Benefits and Techniques of Trenchless Pipe Repair.pdf
Pipe Restoration Solutions
 
一比一原版(UofT毕业证)多伦多大学毕业证成绩单如何办理
一比一原版(UofT毕业证)多伦多大学毕业证成绩单如何办理一比一原版(UofT毕业证)多伦多大学毕业证成绩单如何办理
一比一原版(UofT毕业证)多伦多大学毕业证成绩单如何办理
ydteq
 
DESIGN A COTTON SEED SEPARATION MACHINE.docx
DESIGN A COTTON SEED SEPARATION MACHINE.docxDESIGN A COTTON SEED SEPARATION MACHINE.docx
DESIGN A COTTON SEED SEPARATION MACHINE.docx
FluxPrime1
 
ethical hacking-mobile hacking methods.ppt
ethical hacking-mobile hacking methods.pptethical hacking-mobile hacking methods.ppt
ethical hacking-mobile hacking methods.ppt
Jayaprasanna4
 
CME397 Surface Engineering- Professional Elective
CME397 Surface Engineering- Professional ElectiveCME397 Surface Engineering- Professional Elective
CME397 Surface Engineering- Professional Elective
karthi keyan
 
Nuclear Power Economics and Structuring 2024
Nuclear Power Economics and Structuring 2024Nuclear Power Economics and Structuring 2024
Nuclear Power Economics and Structuring 2024
Massimo Talia
 
Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...
Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...
Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...
Dr.Costas Sachpazis
 
English lab ppt no titlespecENG PPTt.pdf
English lab ppt no titlespecENG PPTt.pdfEnglish lab ppt no titlespecENG PPTt.pdf
English lab ppt no titlespecENG PPTt.pdf
BrazilAccount1
 
AP LAB PPT.pdf ap lab ppt no title specific
AP LAB PPT.pdf ap lab ppt no title specificAP LAB PPT.pdf ap lab ppt no title specific
AP LAB PPT.pdf ap lab ppt no title specific
BrazilAccount1
 
Top 10 Oil and Gas Projects in Saudi Arabia 2024.pdf
Top 10 Oil and Gas Projects in Saudi Arabia 2024.pdfTop 10 Oil and Gas Projects in Saudi Arabia 2024.pdf
Top 10 Oil and Gas Projects in Saudi Arabia 2024.pdf
Teleport Manpower Consultant
 
J.Yang, ICLR 2024, MLILAB, KAIST AI.pdf
J.Yang,  ICLR 2024, MLILAB, KAIST AI.pdfJ.Yang,  ICLR 2024, MLILAB, KAIST AI.pdf
J.Yang, ICLR 2024, MLILAB, KAIST AI.pdf
MLILAB
 
Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)
Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)
Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)
MdTanvirMahtab2
 
Railway Signalling Principles Edition 3.pdf
Railway Signalling Principles Edition 3.pdfRailway Signalling Principles Edition 3.pdf
Railway Signalling Principles Edition 3.pdf
TeeVichai
 
MCQ Soil mechanics questions (Soil shear strength).pdf
MCQ Soil mechanics questions (Soil shear strength).pdfMCQ Soil mechanics questions (Soil shear strength).pdf
MCQ Soil mechanics questions (Soil shear strength).pdf
Osamah Alsalih
 
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...
thanhdowork
 

Recently uploaded (20)

AKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdf
AKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdfAKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdf
AKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdf
 
在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样
在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样
在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样
 
ethical hacking in wireless-hacking1.ppt
ethical hacking in wireless-hacking1.pptethical hacking in wireless-hacking1.ppt
ethical hacking in wireless-hacking1.ppt
 
The role of big data in decision making.
The role of big data in decision making.The role of big data in decision making.
The role of big data in decision making.
 
Immunizing Image Classifiers Against Localized Adversary Attacks
Immunizing Image Classifiers Against Localized Adversary AttacksImmunizing Image Classifiers Against Localized Adversary Attacks
Immunizing Image Classifiers Against Localized Adversary Attacks
 
The Benefits and Techniques of Trenchless Pipe Repair.pdf
The Benefits and Techniques of Trenchless Pipe Repair.pdfThe Benefits and Techniques of Trenchless Pipe Repair.pdf
The Benefits and Techniques of Trenchless Pipe Repair.pdf
 
一比一原版(UofT毕业证)多伦多大学毕业证成绩单如何办理
一比一原版(UofT毕业证)多伦多大学毕业证成绩单如何办理一比一原版(UofT毕业证)多伦多大学毕业证成绩单如何办理
一比一原版(UofT毕业证)多伦多大学毕业证成绩单如何办理
 
DESIGN A COTTON SEED SEPARATION MACHINE.docx
DESIGN A COTTON SEED SEPARATION MACHINE.docxDESIGN A COTTON SEED SEPARATION MACHINE.docx
DESIGN A COTTON SEED SEPARATION MACHINE.docx
 
ethical hacking-mobile hacking methods.ppt
ethical hacking-mobile hacking methods.pptethical hacking-mobile hacking methods.ppt
ethical hacking-mobile hacking methods.ppt
 
CME397 Surface Engineering- Professional Elective
CME397 Surface Engineering- Professional ElectiveCME397 Surface Engineering- Professional Elective
CME397 Surface Engineering- Professional Elective
 
Nuclear Power Economics and Structuring 2024
Nuclear Power Economics and Structuring 2024Nuclear Power Economics and Structuring 2024
Nuclear Power Economics and Structuring 2024
 
Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...
Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...
Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...
 
English lab ppt no titlespecENG PPTt.pdf
English lab ppt no titlespecENG PPTt.pdfEnglish lab ppt no titlespecENG PPTt.pdf
English lab ppt no titlespecENG PPTt.pdf
 
AP LAB PPT.pdf ap lab ppt no title specific
AP LAB PPT.pdf ap lab ppt no title specificAP LAB PPT.pdf ap lab ppt no title specific
AP LAB PPT.pdf ap lab ppt no title specific
 
Top 10 Oil and Gas Projects in Saudi Arabia 2024.pdf
Top 10 Oil and Gas Projects in Saudi Arabia 2024.pdfTop 10 Oil and Gas Projects in Saudi Arabia 2024.pdf
Top 10 Oil and Gas Projects in Saudi Arabia 2024.pdf
 
J.Yang, ICLR 2024, MLILAB, KAIST AI.pdf
J.Yang,  ICLR 2024, MLILAB, KAIST AI.pdfJ.Yang,  ICLR 2024, MLILAB, KAIST AI.pdf
J.Yang, ICLR 2024, MLILAB, KAIST AI.pdf
 
Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)
Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)
Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)
 
Railway Signalling Principles Edition 3.pdf
Railway Signalling Principles Edition 3.pdfRailway Signalling Principles Edition 3.pdf
Railway Signalling Principles Edition 3.pdf
 
MCQ Soil mechanics questions (Soil shear strength).pdf
MCQ Soil mechanics questions (Soil shear strength).pdfMCQ Soil mechanics questions (Soil shear strength).pdf
MCQ Soil mechanics questions (Soil shear strength).pdf
 
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...
 

5 1-33-1-10-20161221 kennedy

Architectural tiers are the emphasis in [3], [9]. Flow scheduling and congestion control are the consideration in [10], [11], [12]. Virtualization is the focus in [13], [14], while application support is the focus in [15], [16]. In all these studies, little attention has been given to QoS performance using FPGA service processing cores. Since cloud-based DCN is a relatively new exploration area in high-performance networks, many of the designs discussed in [5], [6], [8], [11], [14], and [16] did not investigate DCN acceleration.

A Spine-Leaf FPGA network model has many benefits for high-performance market segments. According to [17] and [18], a 4-way Layer-3 Leaf/Spine with Equal Cost Multi-Path (ECMP) architectural processing for routing and other computing services is the new template for high-performance datacenter network designs. In a cloud-based scenario, the type of switch or even the server processor cores can contribute to congestion delays regardless of the data offloading strategy. For example, Portland [8], BCube, and Quantized Congestion Notification (QCN) [11] use rate-based congestion control, which is not efficient. Hence, the Ethernet switches, IP routers, servers, etc. found in existing datacenter architectures cannot be used to implement high-performance datacenter designs.

A high-end FPGA System on Chip (FSoC) could be employed for data offloading, leading to improved QoS for enterprise applications. There are two types, namely the Static Random Access Memory (SRAM) and antifuse versions. These are semiconductor devices built on a matrix of Configurable Logic Blocks (CLBs) connected via programmable interconnects [19]. By construction, FPGAs are efficient at executing a predictable workload. Given that datacenter workloads require high computational capability, energy efficiency, and low cost, a legacy commodity server cannot satisfy these demands. As such, an FSoC can be reprogrammed to offer flexible acceleration of workloads.
To date, many cloud datacenters have not deployed FSoCs as compute accelerators. Hence, to implement efficient cloud DCN designs, rich programmability is required in the cloud DCN service processors, beside the role of Type-1 bare-metal virtualization [20]. There are two approaches in this regard, namely pure software-based programmability [21], [22] and FPGA-based programmability such as NetFPGA [23]. Software-based systems can provide full programmability while sustaining a reasonable packet forwarding rate, but their performance is still not comparable to commodity switch and server FPGA Application Specific Integrated Circuits (ASICs). The batch processing used in existing server switches and software-based switches yields optimizations that introduce high latency. This is critical for various control-plane functions such as signaling and congestion control [6], [8], [11] in high-performance networks.

Considering bandwidth-intensive applications, FPGAs can be designed for low-latency operation, which offers higher value for cloud computing processes. Since FPGA-based systems are fully programmable [24], a datacenter backend can be optimized through in-circuit reconfiguration at power-up to support more functions and achieve seamless data-offloading. Hence, the latest trend in server performance is the data-offloading paradigm. It involves pairing an x86 processor with a highly customizable FPGA device architecture. With this method, workload performance can be enhanced while accommodating changing needs in the future. Clearly, a data-offloading FSoC will improve the throughput of cloud-based Software as a Service (SaaS) by co-processing with a commodity CPU. The same concept can accelerate cloud database searches for improved performance. The major trade-off for acceleration (cloud workload offloading in this case) is that frequent or repetitive tasks or task sequences will affect power demand.

As far as this work is concerned, little research has been carried out in the literature on the QoS effects of cloud network servers, routers, etc. driven by FPGA cores. Hence, there is a need to explore the FPGA target device architecture in developing DCCNs for cloud-based services such as the Enterprise Energy Tracking Analytic Cloud Portal (EETACP), e.g., databases, big data analytics, and high-performance computing.
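To make the co-processing idea above concrete, the following is a minimal, self-contained C++ sketch, not a fragment of the EETACP implementation: the host enqueues a bulk job for an accelerator and stays free for other requests until it collects the result. The fpga_accelerate() routine, the Job structure, and the queue-based hand-off are illustrative assumptions; a real FSoC offload path would move descriptors and results over PCIe/DMA, whereas here the accelerator is emulated with a worker thread so the example compiles and runs anywhere.

```cpp
// Sketch of host/accelerator co-processing: the "accelerator" is a worker
// thread standing in for a hypothetical FSoC offload path.
#include <condition_variable>
#include <future>
#include <iostream>
#include <mutex>
#include <numeric>
#include <queue>
#include <thread>
#include <vector>

struct Job {
  std::vector<int> data;            // payload handed to the accelerator
  std::promise<long long> result;   // completion written back to the host
};

void fpga_accelerate(std::queue<Job>& q, std::mutex& m,
                     std::condition_variable& cv, bool& shutdown) {
  for (;;) {
    std::unique_lock<std::mutex> lk(m);
    cv.wait(lk, [&] { return shutdown || !q.empty(); });
    if (q.empty()) return;                       // shutdown requested
    Job job = std::move(q.front());
    q.pop();
    lk.unlock();
    long long sum = std::accumulate(job.data.begin(), job.data.end(), 0LL);
    job.result.set_value(sum);                   // "DMA" the result back
  }
}

int main() {
  std::queue<Job> q;
  std::mutex m;
  std::condition_variable cv;
  bool shutdown = false;
  std::thread accel(fpga_accelerate, std::ref(q), std::ref(m),
                    std::ref(cv), std::ref(shutdown));

  Job job;
  job.data.assign(1000000, 1);                   // bulk work to offload
  std::future<long long> done = job.result.get_future();
  {
    std::lock_guard<std::mutex> lk(m);
    q.push(std::move(job));
  }
  cv.notify_one();

  // ... the CPU could keep serving latency-sensitive requests here ...
  std::cout << "offloaded sum = " << done.get() << "\n";

  { std::lock_guard<std::mutex> lk(m); shutdown = true; }
  cv.notify_one();
  accel.join();
  return 0;
}
```

The point of the pattern is that the host's service loop is decoupled from the bulk computation, which is what allows an FSoC to raise SaaS throughput without stalling the commodity CPU.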
2. Related Works

2.1 Cloud Datacenter Networking

Traditional datacenter network architectures such as DCCN [3], Portland [8], DCell [25], BCube [26], R-DCN [27], Helios [28], and c-Through [29] have been extensively studied. Most of them use a recursive scheme for scalability and performance, while others construct a separate optical network with an expensive high-port-count 3D MEMS switch side by side with the existing datacenter to add core bandwidth on the fly. Most DCNs, like OmniSwitch, a modular datacenter network architecture, integrate small optical circuit switches with Ethernet switches to provide both topological flexibility and large-scale connectivity. These architectures can be re-modelled using the enhanced Spine-Leaf, mesh, and router layer-3 (tier-2) models running on a low-latency FPGA core; this has not been used in server-centric application deployment strategies. The author in [30] highlighted issues affecting existing commercial off-the-shelf Ethernet switches for these architectures at high link speeds, such as 10 gigabits per second (Gbps). The challenges include:

(a) Extreme complexity, particularly in the switch software, wiring, and scaled troubleshooting.
(b) Various failure modes in the absence of fail-over schemes.
(c) Existing large commercial switches and routers are expensive.
(d) Some datacenters require high port density at the aggregation or datacenter-level switches at extremely high link bandwidth.
(e) Other issues are over-subscription, microburst detection problems when using SNMP polling for TCP sprawl (i.e., many-to-one traffic patterns), high queuing latency, an absence of mobility support for virtual server infrastructure, poor scalability, and inflexibility resulting from legacy designs that have compatibility issues with automated virtualized datacenters.

Therefore, many researchers have kept evolving datacenter network architectures, with most of them focusing on the novel design philosophy of Spine-Leaf, mesh, and router layer-3 models [31], [32]. The new trend in datacenter network models is to address the issues of optimal performance, such as low latency, availability/fault tolerance, utilization, energy efficiency, and scheduling of resources, regardless of the network device.

Regarding the architectural design framework, the most closely related work to this research is the Datacenter-in-a-Box at Low cost (DIABLO) FPGA cluster prototype in [30]. The authors discussed a novel cost-efficient evaluation methodology. FPGAs were used, but datacenters were treated as whole computers with tightly integrated hardware and software. The work enumerated three models:

i. Server models: built on top of RAMP Gold (SPARC V8 ISA), running full Linux 3.5 with a fixed-CPI timing model.
ii. Switch models: based on circuit and packet switching, with abstracted models focusing on switch buffer configurations.
iii. NIC models: having a scatter/gather DMA with zero-copy drivers as well as NAPI polling support.

In integrating the cloud DCN nodes to FPGA cores, Figure 1 illustrates a high-level structure. The system used 6 BEE3 boards carrying 24 Xilinx Virtex-5 FPGAs [30]. The simulation was realized with 3,072 servers in ninety-six racks. The network switches ran at 8.4 billion instructions per second. The validation was on a single-rack physical system with a sixteen-node cluster (3 GHz Xeon) and a 16-port Asante IntraCore 35516-T switch. The physical hardware setup had two servers and 1 to 14 clients. The software configuration included server protocols (TCP/UDP), server worker threads (4 by default), and eight simulated servers (single-core with a 4 GHz fixed-CPI model). Figure 2 shows a type 1 DIABLO without inter-board connections and a type 2 DIABLO fully connected with high-speed cables. Type 2 shares a similar feature with this work.
Fig. 1. DIABLO cluster physical mapping [33]

With FPGA and the use of programmable hardware platforms, the simplification of the load on cloud nodes and network devices will enhance performance. As such, a cloud of general-purpose resources (FPGAs) was used to offload the processed tasks. Andrew P. et al. [34] described a reconfigurable fabric (FPGA Catapult) designed to balance some of these performance concerns. The system was embedded into each half-rack of 48 servers in the form of a small board with a medium-sized FPGA and local DRAM attached to each server. As depicted in Figure 2, the FPGAs are directly wired to each other in a 6x8 two-dimensional torus, allowing services to allocate groups of FPGAs that provide the necessary area to implement the desired functionality. The work was evaluated by offloading a significant fraction of Microsoft Bing's ranking stack onto groups of eight FPGAs to support each instance of that service [34].

Based on the performance expectations of the earlier proposed EETACP (a cloud application deployed on DCCN), the key goals for any datacenter architecture include [9]:

(a) Deterministic latency
(b) Redundancy/high availability
(c) Manageability/flexibility
(d) Excellent resource allocation and scheduling
(e) Scalability and fault tolerance

An improved network architecture based on an FPGA fabric is proposed to achieve the goals above. This model has been shown to be better than the Spine-Leaf, mesh, and layer 3-routed models owing to the performance characteristics of the device. It supports lower latency, offloading, seamless integration, and computing scalability. It is imperative to outline the advantages and disadvantages of the current Spine-Leaf, mesh, and Layer 3-routed network designs; these are shown in Table 1.

In the EETACP DCCN [34], a low-latency and fault-tolerant network was achieved. In this case, the number of network tiers was reduced to minimize system latency. An FPGA-based fabric structure simplifies management, reduces cost, and allows resilient and low-latency networks to be designed, just like the Spine-Leaf model. The robust architectural concepts supported in the DCCN architectures provide high availability and deterministic low latency, and can scale up or down with demand. EETACP was tightly integrated with the OmniVista 2500 Virtual Machine Manager (VMM), providing a unified platform for virtual machine visibility and provisioning with virtual network profiles across the network. These allow seamless server interfacing.

By introducing an FPGA cluster in the above architectures, its advantages in cloud datacenter networks (e.g., DCCN) include:

(a) Allows multi-chassis terminated link aggregation groups to be created.
(b) Creates a loop-free edge without Spanning Tree Protocol (STP).
(c) Provides node- and link-level redundancy, particularly with the integrated-service OpenFlow load balancer.
(d) Enables the overall architecture to be geo-independent, i.e., without co-location support.
(e) Actively supports interconnect switches using standard 10G and 40G Ethernet optics.
(f) Supports redundancy and resiliency across the switches connecting EETACP servers.

In Web-scale data centers, boosting performance with a few FPGA device architectures across thousands of servers will save cost.
Besides, leveraging FPGAs for acceleration in Spine-Leaf models will improve dynamic over-allocation (change management) for large-scale data centers, because enterprise tools must track the FPGA algorithm as it is updated. This is needful for enterprise adoption. With the availability of server virtualization, a hyper-scale datacenter could use FPGA capabilities. This paper opines that new processor architectures based on a programmable FPGA device have several advantages for cloud service provisioning: they allow for scalability on demand and loosely coupled system designs.
Fig. 2. DIABLO cluster prototype with 6 BEE3 boards [30]

Table 1. Advantages and disadvantages of the current Spine-Leaf, Mesh, and Layer 3-Routed network designs

Spine-Leaf Model
Advantages:
• Offers a layer 2/3 common fabric implementation
• Facilitates simpler design
• Fewer interconnects
• Easy to scale within a boundary, with better latency transition
Disadvantages:
• The additional layer of transit hop may impact latency and oversubscription
• Scalability limited to the number of ports in the spine layer

Mesh Model
Advantages:
• Offers a layer 2/3 differentiated fabric implementation
• Highly scalable implementation
• No transit hop
• Lower latency and lower oversubscription ratios
Disadvantages:
• More links used for interconnects

Layer 3-Routed Model
Advantages:
• Offers an end-to-end routed fabric implementation
• Easy to secure at the IP layer
• Fewer interconnects
• Easy to scale
Disadvantages:
• Highly oversubscribed architecture
• Number of transit hops is not deterministic, impacting latency
• Complex design and maintenance
R. Joost and Salomon [35] showed that FPGAs are best suited for tackling most industrial and network-based applications, such as supervisory control systems, cloud computing, the Internet of Things, and other grid computing services. They showed that FPGAs are very powerful, relatively inexpensive, and adaptable, because their configuration is specified in an abstract hardware description language. FPGA-based implementations combine many advantages, such as rapid development cycles, high flexibility, re-usability, moderate cost, easy upgrading (due to the use of abstract Hardware Description Languages, HDLs), and feature extension (as long as the FPGA is not exhausted). For the network pieces in the cloud DCN, the FPGA cores on the servers, switches, and load balancers are managed by a management console in the form of a Software Defined Network (SDN) controller that separates the data, control, and application planes. In context, for updating a switching policy, the network is initially mapped in the design, thereby maintaining a default state and eliminating routine reprogramming of the FPGA logic cells. The use of FPGAs can also complement other chipset accelerators (i.e., GPUs), but at the expense of writing new procedures in VHDL. The issues of power consumption and area-on-chip are vital for performance, considering the number of FPGA cores needed in the network; this trade-off is left for future research.

3. Methodology

In this section, the FPGA modular description is presented. A characterization scenario was used as a basis for generalization. To achieve this, an electronic design simulation tool (Riverbed Modeler) with an extended C++ library was employed in this study. Due consideration was given to an FPGA Virtex UltraScale-driven server machine. This was used for the Spine-Leaf DCCN design, as it offers efficient performance, good system integration, and bandwidth, with the added benefit of re-programmability. In the enterprise setup, the peripheral controllers include general-purpose I/O, UART, timer, debug, SPI, a DMA controller, and Ethernet (an interface to an external MAC/PHY chip). The memory controllers include SRAM, Flash, SDRAM, and DDR SDRAM.

In context, the scalability of the Virtex UltraScale VU440 device is made possible by its ASIC-class architecture, with up to 90% utilization, featuring next-generation routing, ASIC-like clocking, resource utilization, power management, elimination of interconnect bottlenecks, and critical-path optimizations. Its key architectural blocks include wider multipliers, high-speed memory cascading, 33G-capable transceivers, and integrated 100 Gb/s Ethernet MAC and 150 Gb/s IP cores. These devices enable multi-hundred-gigabit-per-second levels of system performance with smart processing at full line rates.

Figure 3 shows a proof of concept demonstrating the initial testbed setup for EETACP deployment. The configuration facilitates the dual-housing of servers/storage and access devices, with links distributed across the DCCN switches. There is no logical loop between the edge devices and the multi-chassis peer switches, even though a physical loop exists. Single-interface servers, storage, and edge devices can be connected to any DCCN switch via a virtualization management console. The setup is based on a general-purpose processor. Using Type-1 bare-metal virtualization makes VM instances feasible, supporting failover, replication, and redundancy in a production environment.
The assumption in this research is that the FPGA concept, as well as Type-1 bare-metal virtualization, must be integrated in the case of myriads of servers, i.e., a massively scaled datacenter, to derive the expected QoS. An FPGA scalable architecture [36] offers a template for adoption in DCCN. Specifically, a Xilinx FPGA comparison showing an optimal configuration for the Virtex UltraScale device has been enumerated in [19]. In that work, the logic cells (K), UltraRAM (Mb), Block RAM (Mb), DSP slices, transceiver count, maximum transceiver speed (Gb/s), total full-duplex transceiver bandwidth (Gb/s), memory interfaces (DDR3 and DDR4), PCI Express, configuration AES, I/O pins, and I/O voltages were all compared against other device architecture variants, showing the Virtex UltraScale device as the most preferred choice. This further facilitated its adoption in the proposed DCN design in Section 3. The FPGA-based system implementations have the following characteristics:

(a) Allow for the integration of soft-core processors.
(b) Have plenty of logic resources for routing.
(c) Have plenty of RAM support.

This observation, combined with the lack of a bypass path, led to a multi-threaded design of large modules. In the validation analysis, this work focuses on FPGA-based datacenters for performance benchmarking. It must be stated that congestion offloading is derived through the use of the over-allocation considered in Figure 3. In this paper, the prototype design of the cloud-based datacenter has only been tested with a very small testbed running realistic micro-benchmarks for cloud computing services. The emphasis is on the QoS comparison with related datacenter cores. The role of Type-1 virtualization as a DCN accelerator is presented in Section 3.1.

3.1 Architectural Model (Type-1 Virtualization)

The goal of the FPGA-based network server model (DCCN) is to have credible workload generation that is scalable and efficient with respect to QoS for a congested traffic pool. A highly accurate framework for the cloud computing workload was developed in Figure 4. At the core, the server clusters must be capable of running complex server application software with minimal modification. In context, an FPGA service model is responsible for executing the target procedure (router, switch, or server CPU) correctly, as well as maintaining the device architectural state in congested networks. Using the Type-1 virtualization strategy makes this feasible. The benefits of this management scheme include:

(a) Simplified mapping of the functional FPGA model. The separation allows complex operations to take multiple host cycles. For example, a highly ported register file can be mapped to a block RAM and accessed in multiple host cycles, avoiding a large, slow mapping to FPGA registers, multiplexers, etc.
(b) Improved flexibility and reuse of resources, even in over-allocation mode. With it, the precise server timing model can be changed without modifying the overall network model, which improves efficiency. For instance, it is possible to use the same VM switch model to simulate both 10 Gbps and 100 Gbps switches just by changing the timing model.
(c) It enables a highly configurable, abstracted timing model. In the virtualized datacenter, splitting the timing function allows the timing model to effect abstraction in the cloud layer; a minimal code sketch of this functional/timing separation is given after this list.
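The following minimal C++ sketch illustrates the functional/timing split in items (b) and (c); it is an illustration of the idea rather than the paper's simulation code, and the SwitchModel, TimingModel, and LinkRateTiming names are assumptions. The same functional forwarding model is retimed as a 10 Gbps or a 100 Gbps switch simply by injecting a different timing object.

```cpp
#include <cstdint>
#include <iostream>
#include <memory>

// Timing model: only answers "how long does this packet occupy the switch?"
struct TimingModel {
  virtual ~TimingModel() = default;
  virtual double serviceTimeNs(std::size_t packetBytes) const = 0;
};

// Link-rate-based timing; swapping 10 Gbps for 100 Gbps changes no functional code.
struct LinkRateTiming : TimingModel {
  explicit LinkRateTiming(double gbps) : gbps_(gbps) {}
  double serviceTimeNs(std::size_t packetBytes) const override {
    return (packetBytes * 8.0) / gbps_;   // bits divided by Gbit/s gives ns
  }
 private:
  double gbps_;
};

// Functional switch model: forwarding behaviour only; timing is injected.
class SwitchModel {
 public:
  explicit SwitchModel(std::unique_ptr<TimingModel> timing)
      : timing_(std::move(timing)) {}
  double forward(std::size_t packetBytes) {
    busyUntilNs_ += timing_->serviceTimeNs(packetBytes);  // abstracted clock
    return busyUntilNs_;
  }
 private:
  std::unique_ptr<TimingModel> timing_;
  double busyUntilNs_ = 0.0;
};

int main() {
  SwitchModel tenGig(std::make_unique<LinkRateTiming>(10.0));
  SwitchModel hundredGig(std::make_unique<LinkRateTiming>(100.0));
  // Same functional model, two timing models: a 1500-byte frame.
  std::cout << "10G  completes at " << tenGig.forward(1500) << " ns\n";
  std::cout << "100G completes at " << hundredGig.forward(1500) << " ns\n";
  return 0;
}
```

Keeping the timing behind a narrow interface is what lets the abstracted timing model in item (c) be reconfigured without touching the functional model.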
Fig. 3. DCCN EETACP server testbed (Kswitche Labs, 2015)

When looking closely at the FPGA characteristics for network architectures, this work identified a wide variety of design choices, such as switch architecture, network topology, protocols, and applications. To support data-intensive processing in an FPGA-based domain, the traffic workloads must be optimized in the cloud environment. As such, optimization via data management must be satisfied for enhanced QoS.

3.2 FPGA Cloud Datacenter Specifications

This paper uses the specification of the cloud datacenter network described in [37]; however, Type-1 server virtualization is considered for resource management in an FPGA-driven DCCN. The network fabric has an OpenFlow load-balancer, a virtual gateway, and server instances on the hypervisor. In the network, a three-stage Clos topology using Nexus 7000 (spine) and Nexus 3000 (leaf) platforms, with FPGA-based N-Servers connected to them, forms a warehouse-scale cloud datacenter. These run on 10–40 Gbps links. The specifications are encapsulated in Figure 4.

3.3 Hyper-Scale Cloud Cluster Server (HCCS)

The FPGA card used in the Spine-Leaf cloud network server (shown in Figure 4) is depicted in Figure 5. It is based on Xilinx Virtex UltraScale FPGA technology (the target device). The characterization in the HCCS is mainly for the Spine-Leaf DCCN. For data-offloading at the server core, this prototype FPGA accelerator card has six Virtex-6 FPGAs linked together by a PCI-Express switch from PLX Technology. Three of them are fixed into a Supermicro SuperServer designed to accommodate three Tesla GPU coprocessors in the DCCN. This has a pair of six-core Xeon 5600-class processors, as shown in Figure 5. The processor core is depicted in Figure 3, while Figure 6 shows the logical placement. In this case, the server machine has 24 half-wide grid sockets. This pattern allows the x86 server processors (grouped into two) to fit into the testbed rack enclosure. On the server, the FPGA co-processor in Figure 6 has eight lanes mapping to Mini-SAS xSFF-8088 connectors, with two ports on each FPGA card. This speeds up data cycling and improves the utilization cycles of the CPU. The server has space for a PCI-Express 3.0 peripheral card situated at the back of the server sled. It has two eight-core Xeon CPUs running at 2.1 GHz with 64 GB of main memory (DRAM). For storage, four 2 TB disk drives (4 HDDs) and two 512 GB solid-state disks (2 SSDs) were provided. The server node has a 10 Gb/s Ethernet port with redundant power supplies. Wireless connectivity via the bay ports is provided by default in the DCCN. The DCCN server FSoC accelerator card, configured in a production setup, is distributed across the server cluster infrastructure. In the deployment context for HCCS-DCCN, two sets of cables are used to implement a ring connection with six xSFF-8088 connectors, and eight connectors are used on a ring for duplication/redundancy. With the six adapter cables, the six FPGA cards (in six adjacent server nodes in the server chassis) are mapped to each other with one set of Mini-SAS ports.
The complex arrangement allows eight different groups of FSoC nodes in a 48-node pod to be self-linked using eight adapter cables. During operation, the FSoCs run at 10 Gb/s across all the Ethernet-connected interfaces. Figure 7 shows the Virtex UltraScale VU440 device used for the service processing cores. It provides the highest system performance and bandwidth for large-scale computing, which suits the typical server scenario in Figure 3.

3.4 HCCS FSoC Data-Offloading Algorithm

Algorithm I describes the server interconnection read and write operations with FPGA data offloading. First, after defining the server configuration with its virtualization mappings, a 10 Gb/s link is used for the interconnection in the cluster subnet. An array of user input jobs arriving through a load balancer Lm, with non-zero terms, is defined for the server. To facilitate read operations from the server, the control variables (a, N, i, j) are used to execute successive read operations in matrix form.
Fig. 4. Cloud Computing Spine-Leaf Cluster (DCCN)
Fig. 5. A typical cloud FPGA accelerator network card [38]
Fig. 6. A modified logical interfacing in the DCCN subnet cluster [38]
Fig. 7. An FPGA-based Virtex UltraScale VU440 device architecture for the cloud server board

To complete successfully, the j control checks for equal availability of server CPUs and their VMs. The first step in job processing is to select the shortest path (i.e., the one with the highest throughput) between the user job request and the server VM in the HCCS. As the workload increases, more bandwidth is over-allocated by the hypervisor virtual machine monitor (VMM), which translates into increased throughput along the path. All processed workloads are returned through the shortest path to end-users, and the cycle re-initializes and repeats the read and write operations.

Using Algorithm I, the EDA study was used to explore the capabilities of the Virtex UltraScale FSoC for DCCN data-offloading. In the study, the RAMs on the FSoC are used to store the simulation thread state, and the threads are switched dynamically to keep the data pipelines saturated. This memory strategy is called HCCS host-multithreading for low-latency data-offloading. Its benefits are summarized below.

• Availability of hard-wired DSP blocks with execution units, especially Floating Point Units (FPUs), which otherwise dominate Look-Up Table (LUT) resource consumption. The implication is that by mapping functional units to DSP blocks rather than just LUTs, more resources are reserved for execution timing.

• DRAM accesses are relatively fast on the FSoC. The logic in the FSoC often runs slower than DRAM because of on-chip routing delays. This insight greatly simplifies the host memory system, as large associative caches are not needed for high performance.

The trade-off between QoS performance and FPGA compute resources is the overall server cost budget parameter.

4. Simulation Validation

4.1 Experimental Design Description

First, an FPGA server process model was built for the DCCN VM clusters. This was realized using Riverbed Modeler Academic Edition 17.5 with its C++ libraries (see https://splash.riverbed.com/community/product-lines/steelcentral/university-support-center/blog/2014/06/11/riverbed-modeler-academic-edition-release) as an EDA tool. The implementation was on a heavily modified host-cache design, and the server model supports a full 32-bit OS. At the core, the Virtex UltraScale was emulated into the service processors shown in Figure 4. In the real setup (depicted in Figure 4), the components introduced include the server farm virtual firewall router (SFV), an emulated OpenFlow controller (OC), and the application and profile configuration windows. This test-center configuration sets up the Web, Database, FTP, and Exchange servers, such as DCCN server 1, server 2, server 3, server 4, server 5, ..., N, and six locations with active users. The system servers run on the Virtex UltraScale FPGA target device. With Type-1 virtualization, servers are placed on the DCCN as VM clusters. The VMs connect user tasks to the HCCS, which processes services concurrently. The application (HTTP service) runs on the OpenFlow controller, whose job is to dispatch the requests to the server clusters. This facilitates resource allocation, scheduling, and load balancing in the DCCN. The simulation experiments were performed on an emulated cloud, at the IaaS level, using datacenter cardinality theory. For the DCCN VM clusters, two physical servers (2 x 8-core Xeon 2.1 GHz CPUs, 64 GB DRAM, 4 x 2 TB HDDs, 2 x 512 GB SSDs, 10 Gb/s Ethernet, with Linux and Mac OS) were configured to run on the CPU model.
The VM instances were created according to the workloads per site. For acceleration, Type-1 full/active virtualization, failover, and over-allocation were enabled simultaneously to address the issues highlighted in Section 2.1. The process-model experimental methodology considered four key metrics: service process latency, throughput, resource availability, and resource utilization. The execution time is measured using the timer functions provided by the C++ trace file diagnostic library. The throughput is determined at the destination as the ratio between the amount of data sent from users and the service processing time. Finally, each metric is computed from the Riverbed framework/simulation for DCCN, DCell, and BCube, and each QoS metric is reported in the plots discussed below.

4.2 Performance Evaluation

After setting up three distinct network scenarios (DCell, BCube, and the FPGA-based DCCN), their services and performance were analyzed in a previous study [38]. The first scenario measures the improvements brought to I/O-intensive FPGA applications from a service-process throughput perspective. By adaptively switching end users from the leaf to the spine models, the servers read and process requests concurrently. This occurs within the datacenter management, which replicates processed multicast jobs and transfers them in a pipelined fashion within the deployment. This paper presents a set of results obtained from a QoS comparison among the three networks using the remote cloud storage, which hosts services such as FTP and database/storage.

4.3 Analysis of Non-FPGA Cloud DCNs

The experiment in context focused on the comparative analysis of three distinct DCNs, namely the Spine-Leaf DCCN (proposed), DCell, and BCube, for network throughput, resource availability, and resource utilization. These networks were configured using a scenario-based approach, and the cloud computing application workload is homogeneous. A suitable framework for evaluating the impact of FPGA acceleration on the cloud datacenter is MapReduce [39], which makes cloud-based computation flexible, though with performance trade-offs in the cloud. This work used an emulated cached MapReduce engine [40] and a general-purpose workflow engine [41] to run trace-file statistics. For the three scenarios, the number of mappers (32 MB per job), the data size (1024 MB), and the number of reducers (3) were maintained in all cases.
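Before turning to the results, the latency and throughput metrics defined in Section 4.1 can be illustrated with a minimal instrumentation sketch. The C++ fragment below uses standard timers and computes throughput as the ratio of data sent to service processing time; it stands in for the Riverbed trace-file diagnostic library, whose exact API is not reproduced here, and the byte count is an arbitrary example value.

// Sketch of the Section 4.1 measurements: service time from a steady clock,
// throughput = bits delivered / service processing time.
#include <chrono>
#include <cstddef>
#include <iostream>

struct QoSSample {
    double latency_s;        // service process latency
    double throughput_bps;   // bits delivered per second of service time
};

template <typename ServiceFn>
QoSSample measure(ServiceFn&& service, std::size_t bytes_sent) {
    auto start = std::chrono::steady_clock::now();
    service();  // process the user job (read, offload, write back)
    auto stop  = std::chrono::steady_clock::now();
    double secs = std::chrono::duration<double>(stop - start).count();
    return { secs, (bytes_sent * 8.0) / secs };
}

int main() {
    auto sample = measure([] { /* emulated service processing */ }, 32 * 1024 * 1024);
    std::cout << "latency(s)=" << sample.latency_s
              << " throughput(bps)=" << sample.throughput_bps << "\n";
}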
Algorithm I: DCCN Server Read/Write Operations

procedure FPGA_DataOffloading_ReadWrite   // use the FPGA to carry out data offload via read and write operations
  define DCCN-Server I/O                  // a distributed cloud computing server must have well-defined inputs and outputs
  program ServerMatrix(Input, Output)
  const MaxS = S(n+1)                     // the recursive server chain ensures that server redundancies are maintained
  while j <= K do
    S.Vm = V(m+1)                         // recursive server virtual instances for internal server resources (I/O, RAM, CPU, etc.)
    FPGA_acceleration = ShortestPathJobOffload   // initialization
    set link = 10 Gb/s                    // interconnection links
    if Var = Var + 1 then                 // Var allocates memory spaces on the CPU for read/write operations, provided they are not used up by the CPU
      sort with FSoC
      var P, Q, R, N(k+1): array[0..MaxN, 0..MaxN] of real (non-zero terms)
      a, N, i, j: integer                 // a = security term; N, i, j = control loop variables
    end if
  end while
  begin                                   // read operation
    readln(N)                             // read user jobs from the CPU
    while i <= j do
      for i := 0 to N-1 do for j := 0 to N-1 do read(P[i])       // read the job/task requests
      for i := 0 to N-1 do for j := 0 to N-1 do read(Q[i])
      for i := 0 to N-1 do for j := 0 to N-1 do read(R[i])
      for i := 0 to N-1 do for j := 0 to N-1 do read(N(k+1)[i])
      for i := 0 to N-1 do for j := 0 to N-1 do r[i] := P[i] + Q[i] + R[i] + ... + N(k+1)[i]
      for i := 0 to N-1 do for j := 0 to N do
        if j = S(n+1) then                // get the job request thread with maximum throughput
          while j := i+1 to N do
            if a[j] != a[MinSec] then
              S(n+1) != 1
              DataOffload >= 0            // server CPU
            else
              return
            end if
          end while
        end if
      TransferJob.ShortestPath = NextPath // the recursive CPU server chain ensures that workloads are transferred along the shortest path
    end while
  end
end procedure

Fig. 8. HCCS-DCCN Read/Write Algorithm
Fig. 9. Throughput Stability Response
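Algorithm I is expressed in Pascal-like pseudocode. The C++ sketch below captures only its central offloading decision: each user job is dispatched to the server VM whose path currently offers the highest throughput (the "shortest path" in the algorithm), with the hypervisor's over-allocation reduced to a simple bandwidth increment. The structures and constants are illustrative assumptions, not the DCCN implementation.

// Sketch of the offload decision: read user jobs, send each job to the VM
// reachable over the path with the highest available throughput, and let the
// hypervisor (VMM) over-allocate bandwidth on that path as load grows.
#include <cstddef>
#include <vector>

struct ServerVm {
    double path_throughput_gbps;  // current available throughput to this VM
    double allocated_gbps = 0.0;  // bandwidth granted by the hypervisor (VMM)
};

struct Job { std::size_t bytes; };

// Pick the VM on the "shortest path", i.e. the path with maximum throughput.
std::size_t best_vm(const std::vector<ServerVm>& vms) {
    std::size_t best = 0;
    for (std::size_t i = 1; i < vms.size(); ++i)
        if (vms[i].path_throughput_gbps > vms[best].path_throughput_gbps)
            best = i;
    return best;
}

void offload(const std::vector<Job>& jobs, std::vector<ServerVm>& vms) {
    for (const Job& j : jobs) {
        std::size_t v = best_vm(vms);                       // read/select step
        vms[v].allocated_gbps += j.bytes * 8.0 / 1e9;       // over-allocation as load rises
        vms[v].path_throughput_gbps -= 0.1;                 // crude model of the added load
    }
}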
Fig. 10. Cloud Server Utilization Response

From Figure 9, it was observed that the proposed FSoC-DCCN had a comparatively better throughput owing to its optimal virtual-instance allocation coordinator. In this regard, the average throughput stability responses for DCCN, DCell, and BCube are 40.00%, 33.33%, and 26.67%, respectively. With Type-1 virtualization of the Spine-Leaf DCCN server cores alongside the FSoC acceleration, this relative performance is feasible. Scientific workflows running in large, geographically distributed, and highly dynamic computing environments can efficiently use the FSoC-DCCN, because FSoC-based platforms can effectively satisfy throughput stability requirements in a production deployment. In Figure 10, resource availability refers to the ability to access the FSoC-DCCN server clusters on demand while completing the job requests. The complexity of the cloud datacenter architecture and its overall infrastructure makes resource utilization another important parameter. It was observed that the proposed FSoC-DCCN offered better resource utilization (for the workloads) compared with the BCube and DCell scenarios. When all existing resources in the FSoC-DCCN server clusters are used up by means of over-allocation, additional resources can be reserved for high-priority jobs that arrive. In context, when a job arrives, the availability of the VM is guaranteed; the issue is the availability of resources to execute the job. If the VM is available, then the job is allowed to run on the VM via dynamic allocation, considering the network density. This occurs only for Type-1 virtualization on the cloud DCN Spine-Leaf model. It was shown that the proposed scheme had about 58.06% resource utilization (i.e., when logically isolated with FPGA device cores), while the others offered 38.71% (BCube) and 3.23% (DCell), respectively (i.e., when not logically isolated with FPGA cores). The implication is that FPGA-based DCCNs will offload tasks from server processors more frequently than other accelerator options, since cloud service processing rates are high. It also implies that the proposed model will offer fairly good resource availability, leading to enhanced performance. This makes it more attractive in hyperscale datacenters for Warehouse Scaled Computers (WSC). Hence, VM-based cloud networks, particularly the cell-based and Spine-Leaf WSC, can benefit from this advantage.

From the plots in Figures 9 and 10, network infrastructure that processes bandwidth-intensive applications will scale optimally with the FSoC. This is because a key potential benefit of the integrated processor and FPGA system is the ability to boost system performance by accelerating compute-intensive functions in FPGA logic (i.e., hardware acceleration and cache coherency) while making more resources available. Processor performance is improved by the FSoC co-processing roles, from computing cyclic redundancy checks (CRC) to offloading the entire TCP/IP stack. When the FPGA-based accelerator produces a new result, the data needs to be passed back to the processor as quickly as possible, so that the processor can update its view of the data.
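CRC computation is cited above as a typical function moved from the server CPU to the FPGA. The routine below is a plain software reference for CRC-32 (the reflected IEEE 802.3 polynomial 0xEDB88320); it is the kind of per-byte loop an FSoC offload would replace, and it is included only for illustration rather than being taken from the EETACP design.

// Software reference for CRC-32 (reflected IEEE 802.3 polynomial).
// A hardware offload would compute the same value for the same payload.
#include <cstddef>
#include <cstdint>

uint32_t crc32_ieee(const uint8_t* data, std::size_t len) {
    uint32_t crc = 0xFFFFFFFFu;
    for (std::size_t i = 0; i < len; ++i) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; ++bit)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return crc ^ 0xFFFFFFFFu;
}

Comparing this reference against the accelerator's output also gives a simple correctness check when validating an FPGA offload path.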
As a validation, a network case of 1,632 servers with FPGAs running an enterprise web search service was analyzed in Figure 11, which shows improved throughput with FPGA acceleration compared with the case without FPGA acceleration in terms of query latency responses. Another key benefit of integrating a General-Purpose Processor (GPP) and an FPGA on the same silicon real estate is the ability to accelerate system performance by offloading critical functions to the FPGA. Transferring the data quickly and coherently is key to realizing a performance boost in cloud-based networks. Datacenter network optimization with FPGA acceleration improves bandwidth efficiency while satisfying QoS metrics. Network equipment embedded with FPGA processors can eliminate performance bottlenecks that software-driven processors cannot overcome. Smart computing and intelligence applications with massive workloads will benefit from this alternative.

5. Conclusion

This paper has presented a super-scalar cloud datacenter network built with FPGA core support. It offers excellent throughput, low latency, and good resource utilization when compared with the DCell and BCube datacenter networks. Hence, offloading key functions from the processor to the FPGA can result in substantial improvement in system performance while reducing system power drain. As observed in existing Warehouse Scaled Computers (WSCs), high-speed, low-latency interconnects between the processors and the FSoC are necessary for optimal performance.
Fig. 11. FPGA Query Latency Behavior [38]

The proposed datacenter network offers memory coherency through the use of FPGA acceleration coherency. With this, issues of bandwidth, performance, integration, and power requirements are addressed. In highly dynamic environments, various types of computing workloads (such as databases, big data analytics, and high-performance computing) can be improved using the proposed FPGA acceleration in the Spine-Leaf datacenter model. As more and more workloads are deployed in the cloud, it is appropriate to consider how to make FPGAs and their capabilities available in the cloud. Hence, the proposed system offers a low-latency path from the network interface to the consuming process, irrespective of network workloads. As a proof of concept and validation, a micro-testbed setup on a real-life datacenter was explored. The work used the DCCN to model a datacenter Spine-Leaf architecture running traffic patterns sampled from the Riverbed application engine on top of Linux-KVM and the Virtex UltraScale FPGA target device. This enables isolation between multiple processes in multiple VMs, supporting accurate acceleration, resource allocation, and priority-based workload scheduling for QoS. The results from the FPGA DCCN offloading strategy in Spine-Leaf designs show that Type-1 virtualization influences resource allocation and scheduling. With FPGA acceleration, the performance of cloud computing systems, particularly in QoS contexts, is enhanced. Consequently, newer processors can use FPGAs to accelerate applications (workload optimization). Furthermore, with WSC (FPGA-based servers), the Central Processing Unit (CPU) of Spine-Leaf topologies can easily offload tasks to FPGA device architectures for hardware acceleration. The conclusion is that global deployment of FPGA-based cloud datacenters will enable large-scale scientific workflows to improve performance and deliver fast responses regarding QoS. Future work will focus on mathematical modeling and state analysis of Markovian queues with working vacation on heterogeneous FPGA cloud-based servers. The work will also investigate power drain in high-density networks, chip area, and comparisons with GPUs and other accelerators.

ACKNOWLEDGMENTS. We wish to specially thank the Cloud Computing and Distributed Systems (CLOUDS) Laboratory at the University of Melbourne, Australia; the Department of Electronic Engineering, UNN; the Center for Basic Space Science, UNN; the Energy Commission of Nigeria, NCERD-UNN; and the National Agency for Science and Engineering Infrastructure (NASENI) for their immense support in the course of this research work.

References
1. Microsoft (2016) Azure success stories. Online: http://www.windowsazure.com/en-us/case-studies/archive/.
2. Lu G et al. (2011) ServerSwitch: A programmable and high performance platform for data center networks in Proc. of the 8th USENIX Conference on Networked Systems Design and Implementation (NSDI'11).
3. Okafor K, Ugwoke F, Obayi AA, Chijindu V, Oparaku O (2016) Analysis of cloud network management using resource allocation and task scheduling services. International Journal of Advanced Computer Science & Applications 1(7):375–386.
4. Guo C et al. (2008) DCell: A scalable and fault-tolerant network structure for data centers. ACM SIGCOMM Computer Communication Review 38(4):75–86.
5. Al-Fares M, Loukissas A, Vahdat A (2008) A scalable, commodity data center network architecture.
SIGCOMM Comput. Commun. Rev. 38(4):63–74.
6. Guo C et al. (2009) BCube: A high performance, server-centric network architecture for modular data centers. ACM SIGCOMM Computer Communication Review 39(4):63–74.
7. Greenberg A et al. (2009) VL2: A scalable and flexible data center network. ACM SIGCOMM Computer Communication Review 39(4):51–62.
8. Niranjan Mysore R et al. (2009) PortLand: A scalable fault-tolerant layer 2 data center network fabric. SIGCOMM Comput. Commun. Rev. 39(4):39–50.
9. D KC (2016) Ph.D. thesis (University of Nigeria, Nsukka).
10. Okafor K, Nwaodo T (2012) A synthesis VLAN approach to congestion management in datacenter internet networks. International Journal of Electronics and Telecommunication System Research 5(6):86–92.
11. Alizadeh M et al. (2008) Data center transport mechanisms: Congestion control theory and IEEE standardization in 46th Annual Allerton Conference on Communication, Control, and Computing, 2008. (IEEE), pp. 1270–1277.
12. Al-Fares M, Radhakrishnan S, Raghavan B, Huang N, Vahdat A (2010) Hedera: Dynamic flow scheduling for data center networks in NSDI. Vol. 10, pp. 19–19.
13. Wood T (2011) Ph.D. thesis (University of Massachusetts Amherst).
14. Guo C et al. (2010) SecondNet: A data center network virtualization architecture with bandwidth guarantees in Proceedings of the 6th International Conference. (ACM), p. 15.
15. Shieh A, Kandula S, Sirer EG (2010) SideCar: Building
programmable datacenter networks without programmable switches in Proceedings of the 9th ACM SIGCOMM Workshop on Hot Topics in Networks. (ACM), p. 21.
16. Abu-Libdeh H, Costa P, Rowstron A, O'Shea G, Donnelly A (2010) Symbiotic routing in future data centers. ACM SIGCOMM Computer Communication Review 40(4):51–62.
17. Arista (2015) Arista universal cloud network white paper (https://www.arista.com).
18. Cisco (2013) Cisco FabricPath technology and design, BRKDCT-2081 (http://www.valleytalk.org/wp-content/uploads/2013/08/BRKDCT-2081-Cisco-FabricPath-Technology-and-Design.pdf).
19. Xilinx (2016) Field programmable gate array (FPGA) (https://www.xilinx.com/training/fpga/fpga-field-programmable-gate-array.htm).
20. Goldberg RP (1973) Architecture of virtual machines in Proceedings of the Workshop on Virtual Computer Systems. (ACM), pp. 74–112.
21. Kohler E, Morris R, Chen B, Jannotti J, Kaashoek MF (2000) The Click modular router. ACM Transactions on Computer Systems (TOCS) 18(3):263–297.
22. Dobrescu M et al. (2009) RouteBricks: Exploiting parallelism to scale software routers in Proceedings of the ACM SIGOPS 22nd Symposium on Operating Systems Principles. (ACM), pp. 15–28.
23. Naous J, Gibb G, Bolouki S, McKeown N (2008) NetFPGA: Reusable router architecture for experimental research in Proceedings of the ACM Workshop on Programmable Routers for Extensible Services of Tomorrow. (ACM), pp. 1–7.
24. Yang R, Wang J, Clement B, Mansour A (2013) FPGA implementation of a parameterized Fourier synthesizer in Electronics, Circuits, and Systems (ICECS), 2013 IEEE 20th International Conference on. (IEEE), pp. 473–476.
25. Kliegl M et al. (2010) Generalized DCell structure for load-balanced data center networks in INFOCOM IEEE Conference on Computer Communications Workshops, 2010. (IEEE), pp. 1–5.
26. Overholt M, Wang S (2016) Modularized data center cube (http://pbg.cs.illinois.edu/courses/cs538fa11/lectures/17-Mark-Shiguang.pdf).
27. Udeze C, Okafor K, Okezie C, Okeke I, Ezekwe C (2014) Performance analysis of R-DCN architecture for next generation web application integration in 2014 IEEE 6th International Conference on Adaptive Science & Technology (ICAST). (IEEE), pp. 1–12.
28. Farrington N et al. (2010) Helios: A hybrid electrical/optical switch architecture for modular data centers. ACM SIGCOMM Computer Communication Review 40(4):339–350.
29. Wang G et al. (2010) c-Through: Part-time optics in data centers. SIGCOMM Comput. Commun. Rev. 41(4):–.
30. Tan Z (2013) Ph.D. thesis (Department of Electrical Engineering and Computer Sciences, University of California, Berkeley).
31. Cisco (2012) Cisco's massively scalable data center network fabric for warehouse scale computers, (Cisco), Technical report.
32. Alcatel-Lucent (2013) Data center converged solutions design guide, (Alcatel-Lucent), Technical report.
33. Tan Z, Qian Z, Chen X, Asanović K, Patterson D (2013) DIABLO: Simulating datacenter networks at scale using FPGAs, (ASPIRE, UC Berkeley), Technical report.
34. Putnam A et al. (2015) A reconfigurable fabric for accelerating large-scale datacenter services. IEEE Micro 35(3):10–22.
35. Joost R, Salomon R (2005) Advantages of FPGA-based multiprocessor systems in industrial applications in 31st Annual Conference of IEEE Industrial Electronics Society, 2005 (IECON 2005). (IEEE), 6 pp.
36. Savaš E, Tenca AF, Koç CK (2000) A scalable and unified multiplier architecture for finite fields GF(p) and GF(2^m) in International Workshop on Cryptographic Hardware and Embedded Systems. (Springer), pp. 277–292.
37. Okafor KC, Ezeha G, Achumba IE, Okezie C, Diala UH (2015) Harnessing FPGA processor cores in evolving cloud based datacenter network designs (DCCN) in Proc. 12th International Conference of the Nigeria Computer Society: Information Technology for Inclusive Development. (Nigerian Computer Society), pp. 1–14.
38. Morgan TP (2014) How Microsoft is using FPGAs to speed up Bing search (http://www.enterprisetech.com/2014/09/03/microsoft-using-fpgas-speed-bing-search/).
39. Dean J, Ghemawat S (2008) MapReduce: Simplified data processing on large clusters. Communications of the ACM 51(1):107–113.
40. Chauhan A, Fontama V, Hart M, Tok WH, Buck W (2014) Introducing Microsoft Azure HDInsight: Technical Overview. (Microsoft Press).
41. Simmhan Y, Van Ingen C, Subramanian G, Li J (2010) Bridging the gap between desktop and the cloud for eScience applications in 2010 IEEE 3rd International Conference on Cloud Computing. (IEEE), pp. 474–481.