Open Packet Processing Acceleration
Nuzzo, Craig, cnuzz2@uis.edu
Summary
The amount of data in our world is growing rapidly. The behind-the-scenes impact of this growth,
however, is less apparent. All of that data has to be handled by something, and the answer has always
been computer networks. Over the years, the inner workings of these networks have been refined as
they moved from bare-metal systems to virtualized and then software-defined ones. The problem the
industry faces is keeping up with the speed of the Internet while no new bleeding-edge hardware
architectures are available yet. With the slowing of Moore’s Law, hardware is now getting only a
little faster, so we have to look at the software instead. The amount of data is so large that we need
to discuss what else can be done to accelerate the packets that move that data around the Internet.
This is where Xen and OpenDataPlane (ODP) come into play for the VirtuOR group. After a brief
overview of the solution and a closer look at ODP, we will examine the solution in detail and the
results VirtuOR saw in the end.
VirtuOR addresses part of this problem by implementing a solution that accelerates packet
processing, thereby reducing the bottleneck for data. They chose Xen because of its high compatibility
with ODP. This choice was not made lightly, since ODP can drastically impact the performance of all
processes on a device, by up to 89% (Braham, 2016, p. 408). They chose to modify the existing Xen
architecture (Braham, 2016, p. 408) with an implementation of the OpenDataPlane (ODP) project. The
new Xen architecture (as shown in Figure 1) virtualizes the CPU cores by integrating them into a
virtual privileged domain called the driver domain. This achieves accelerated packet processing
without adding overhead on the physical CPU cores.
ODP is an open-source project that gives application programmers an easy-to-use programming
environment for data plane applications (OpenDataPlane, 2014). It achieves this by providing common
APIs, utilities and configuration files for the underlying hardware. The goal of ODP is to create a
data plane application framework for many different platforms. The accelerated packet processing
described in this report uses the Pull Model packet processing scheme from ODP.
The Pull Model (as shown in Figure 2) organizes packets with a Scheduler function. The advantage
is that desired packets can be prioritized for faster processing (Braham, 2016, p. 409). All of this
depends on the number of threads, which in turn depends on how many CPU cores are allocated to the
application. Each thread uses all the resources available to it to accelerate the packets, so the
speed is controlled by the number of allocated cores, or in this case threads, launched by the
application. All of these ideas are placed into a Linux virtual machine within the driver domain of
the new Xen architecture (as shown in Figure 1). The driver domain is responsible for adding or
removing the virtual CPU cores used by ODP, which is what accelerates the packet processing. ODP then
launches threads corresponding to the number of virtual cores in the driver domain. Those threads
continue to accelerate the processing of packets without loading the physical CPU. The beauty of
virtualized CPU cores is that adding more of them has no influence on the underlying physical CPU of
the system (Braham, 2016, p. 410). Replacing the physical CPU cores with the virtual CPU cores in the
driver domain is the crux of VirtuOR’s solution.
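
As a rough illustration of the Pull Model described above, the following Python sketch shows worker
threads pulling packets from a shared priority scheduler. It is not the ODP API and not VirtuOR's
implementation. The function name run_pull_model, the (priority, payload) packet format and the
use of a priority queue as the scheduler are all assumptions made for illustration. The num_workers
argument stands in for the number of virtual CPU cores assigned to the driver domain.

```python
import queue
import threading


def run_pull_model(packets, num_workers):
    """Process (priority, payload) pairs with worker threads that pull from a shared scheduler.

    Hypothetical sketch: lower priority numbers are served first. Returns the
    payloads in the order the workers finished handling them.
    """
    scheduler = queue.PriorityQueue()
    # A sequence number breaks ties so equal-priority packets keep arrival order.
    for seq, (priority, payload) in enumerate(packets):
        scheduler.put((priority, seq, payload))

    processed = []
    lock = threading.Lock()

    def worker():
        # Each worker keeps pulling the highest-priority packet until none remain.
        while True:
            try:
                _, _, payload = scheduler.get_nowait()
            except queue.Empty:
                return
            with lock:
                processed.append(payload)

    # One thread per (assumed) virtual core.
    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return processed
```

With a single worker the packets come out strictly in priority order. With more workers every
packet is still handled exactly once, but the completion order between threads may vary.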
In the end, the use of ODP brought several advantages, including: 1) compatibility with the
majority of NICs and drivers on the market and 2) classification of different packet flows with ODP
functions, which allows better prioritization of packets for monitoring.
The real-life implementation by VirtuOR was within their Metamorphic Networks platform (M-Net). The
platform can dynamically remove, create or move VMs within the Xen environment (as shown in
Figure 3). M-Net uses the TRILL protocol over a wired network of two physical nodes. TRILL provides
simple forwarding and speed since it calculates the shortest path based on a combination of the
IS-IS protocol and Dijkstra’s algorithm (Braham, 2016, p. 410). All traffic going to different
domains is then managed by the driver domain and ODP.
The solution was tested on two M-Net devices equipped with a 2.5 GHz Intel Core 2 Duo processor and
four Intel 82571EB Gigabit Ethernet cards (Braham, 2016, p. 411), running a proprietary Linux
distribution developed by VirtuOR that contained the new Xen architecture. The evaluation metrics
were: maximum throughput reached, number of processed packets, percentage of bandwidth used and
percentage of virtual and physical CPU resources used on both architectures (Braham, 2016, p. 411).
As shown in Figure 4, the packet processing of the new architecture has a gain of 15% when the
number of virtual CPU cores is more than 1. The throughput evaluation concluded that 958 Mbit/s is
achievable. The bandwidth comparison showed that this roughly 95% use of bandwidth is reached with 2
virtual CPU cores in the new architecture. They observed that the only CPU resources used for packet
processing were virtual ones. This came out to be 89% for the new architecture and 9.4% for the old
(Braham, 2016, p. 412). For future work, the VirtuOR team hopes to compare their solution to other
packet processing accelerators in the industry.
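
The relationship between the reported throughput and the bandwidth percentage can be checked with
a small calculation. The sketch below assumes the nominal link capacity is 1000 Mbit/s, which is
inferred from the Gigabit Ethernet cards and is not a figure stated in the paper. The function name
utilization_percent is hypothetical.

```python
def utilization_percent(throughput_mbps, link_capacity_mbps=1000.0):
    """Return throughput as a percentage of link capacity.

    The 1000 Mbit/s default is an assumption based on Gigabit Ethernet NICs.
    """
    if link_capacity_mbps <= 0:
        raise ValueError("link capacity must be positive")
    return 100.0 * throughput_mbps / link_capacity_mbps


# Reported throughput of 958 Mbit/s on an assumed 1 Gbit/s link.
print(f"{utilization_percent(958):.1f}%")  # prints 95.8%
```

Under that assumption, 958 Mbit/s corresponds to about 95.8% of the link, which is consistent with
the roughly 95% bandwidth use reported above.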
Future Work
One of the main reasons this paper was chosen is that it exemplifies the open-source community. By
bringing together multiple open-source solutions, a new one is born. The team at VirtuOR combines
three main open-source projects: the Xen Project, the Linux kernel and the OpenDataPlane project.
Together these allow them to build faster packet processing into their own solutions. This approach
is on the rise. We see more open code than ever before. Microsoft has even recently joined the
Linux Foundation and opened up its .NET platform. Many other companies are publishing their code
on places like GitHub for the public to see. This growth will only help packet processing and
software-defined networking speed up the Internet further. Collaboration is becoming a healthy
solution for the networking industry. Two related computing topics in open source are cloud computing
and graphics processing. Integrating these platforms may help software-defined packet processing.
The cloud has become a popular option among modern IT departments. It lets them concentrate on
improving their code without the physical overhead of running in-house servers. An implementation
like the one in this research paper would most definitely help improve those services. Not only would
it improve the cloud service for the business, but also for the client, if they are able to use
software-defined accelerated packet processing as an option or by default. This would be a lucrative
arrangement for both parties.
Another consideration may be to take advantage of graphics processing units (GPUs). The modern
GPU architecture offers quite high computational throughput and very efficient memory. This
application could benefit from GPUs on both the software and hardware sides. GPUs are inexpensive
and more readily available than many CPUs. Even more impressive is the fact that the 60-200 ns of
latency involved in retrieving data from main memory can be hidden (Kalia, Zhou, & Andersen, n.d.).
This, paired with code written in CUDA or OpenCL, would do wonders for a project like Metamorphic
Networks.
Accelerated packet processing may not be the only thing in the OSI Model that can be virtualized.
Research into virtualizing other aspects of computer networking could be considered. Academia and
enterprise already use network virtualization not only to learn about networking concepts, but to
apply them to real-world solutions. We see this in software-defined networking implementations
already. This idea could also be used to sandbox certain aspects of networking in order to contain
the damage of inevitable cyber attacks. Such modular code could help IT departments avoid
unnecessary attacks by letting them remove and replace networking components from a software control
panel or the command prompt itself.
Continued research into and implementation of accelerated packet processing is more important now
than ever. Companies should take it seriously as their network stacks are overrun by massive
amounts of data. As 4K video is pushed out into the wild, video streaming services should look to
implement some of the aforementioned ideas. That would do us all some good.
Citations
Rabia, T., Braham, O., & Pujolle, G. (2016). Accelerating packet processing in a Xen environment with
OpenDataPlane. 2016 IEEE 30th International Conference on Advanced Information Networking and
Applications, 408-413.
OpenDataPlane™ Introduction and Overview [An in-depth introduction to the OpenDataPlane.].
(2014, January).
Kalia, A., Zhou, D., & Andersen, D. G. (n.d.). Raising the Bar for Using GPUs in Software Packet
Processing. Carnegie Mellon University and Intel Labs.