"One Variable to Control Them All", a novel formulation of using Openflow, to achieve Network Virtualization, SDN, Network Function Virtualization, Service Chain QoS. An application in Docker networking is demo shown in www.daolicloud.com
One Variable to Control Them All for Openflow (and Application in Docker Networking)
Wenbo Mao
DaoliCloud Company, Beijing, Shanghai, China
www.daolicloud.com
nvi at daolicloud dot com
October 30, 2014
Abstract
This article proposes a novel, general-purpose formulation for using the Openflow standard. The
proposed technology can provision various network services and functions: network virtualization,
network function virtualization, overlay networks, service chain functions, etc. These services and
functions are provisioned by strictly following the original OSI seven-layer model, in which a
higher-layer packet is the payload data of a lower-layer one. Common practices of encapsulating a
lower-layer packet in a higher-layer one, such as MAC/IP in UDP/IP, are compared with the proposed
technology and shown to incur cost with no gain in functionality or, worse, to have undesirable side
effects. We provide a number of application scenarios to show why and how the proposed formulation of
using Openflow is not only much more effective and efficient than the common practices, but also
sufficient.
1. Problem Statement
In order to control, manage and monitor network traffic, or to apply useful and/or security functions
to achieve quality network services, it is necessary first to identify the traffic: only with sure
identification can the desired services/functions be provided/applied. The networking industry has long
been on a seemingly endless course of labelling packet headers for this identification purpose. Labelling
needs space in packet headers, which are fixed by the OSI standard and have already been exhausted by
earlier services/functions in the history of network technology development. A very clever solution to
the lack of labelling space is packet encapsulation: whenever a new label is needed to identify a new
service/function, the packet in question is encapsulated as payload data in a labelling packet. The
encapsulated packet becomes higher-layer data for the encapsulating packet, even though in most cases
the former is a lower-layer one. That is why we often see “MAC (L2) in UDP (L4)”, “IP (L3) in UDP”, or
“MAC in MPLS (L3/L2 in L2)” encapsulation formulations. Such lower-layer-in-higher-layer encapsulation
formulations do not comply with the original OSI standardized packet formulation; nevertheless, the
industry keeps proposing such non-OSI-standard formulations, e.g., VXLAN, STT and Geneve, to name a few
very recent ones. Unfortunately, any new packet encapsulation format, in particular one for
services/functions provided/applied toward the trunk-ward part of a service chain, often requires a
forklift upgrade of networking devices, which is obviously slow and costly. A worse problem with
labelling/encapsulation is that such formulations place ad-hoc control information into the forward
path, impairing the flexibility, generality and programmability of network control or, worse still,
limiting network scalability. We will discuss a number of these drawbacks and reveal the bandwidth
inefficiency of labelling/encapsulation techniques. Labelling/encapsulating packet headers is somewhat
like programming without variables, which explains why the industry has been on such an endless
standardization course: each non-programmable proposal solves only one problem, in an ad-hoc manner.
2. Openflow
The Openflow protocol standard ingeniously envisioned the need to separate the control path from the
forward path in computer networking. This simple and correct idea has been widely accepted by the
networking industry and academia, and has quickly been adopted as an exceptionally important standard.
Many useful practices, with many more yet to come, will stay with us and add, like other treasures, to
the wealth of mankind.
3. One Variable to Control Them All for Openflow
Communication is always necessarily started by an initiator, and optionally answered by one or more
responder(s). This model is also true in a service chain scenario where an initiator may be a service
provisioning party such as a network carrier who provides/applies a network service/function to a set of
identified packets. In the remainder of this article we shall use initiator as the unified name for this
necessary entity which either creates packet(s) as the true communication initiator to send, or modifies
packet(s) as a service provisioning initiator to forward, to communication/service provisioning peer(s).
3.1. Unique Identification of a Communication Initiator
In the OSI model, the transport layer (L4) contains a variable that labels a packet flow or a set of
packets in a flow. This is either the initiator’s port in TCP, UDP and similar protocols, or the
initiator’s identifier in ICMP and similar protocols. In the remainder of this article let us use “(L4)
Initiator’s Context Tag” (ICT for short) as the unified term for this L4 variable. The ICT was originally
designed in the OSI standard for an initiator to identify and distinguish the communication contexts
that it has itself initiated. The ICT is a 16-bit variable and can thus distinguish 65536 different
flows. This is sufficient entropy for identifying all the communication flows that a terminal initiator
(e.g., a laptop in an intranet, or the guest OS of a virtual machine) can manage in practice. If an
initiator is a servicing entity at some trunk-ward part of the communication data path (e.g., a server
host OS, an intranet gateway, a data center gateway, or a carrier gateway), then upon detecting a
collision of an ICT that comes from more than one entity at the edge-ward part of the data path, the
servicing initiator can create new ICTs, called servicing ICTs (S-ICTs for short), to replace the
colliding ICT, and maintain correct and unique mappings between the new S-ICTs and the colliding ICT.
The uniqueness of the mappings is guaranteed because the colliding parties, i.e., the edge-ward
entities, differ in other header contexts, e.g., their L3 IPs and/or L2 MACs, which the servicing
initiator can use to distinguish them. We therefore know that the 16-bit ICT variable has sufficient
entropy for any initiator to uniquely identify and distinguish all packet flows or sets of packets in
all flows. With this uniqueness property, any initiator, be it a terminal or a servicing one, upon
seeing an ICT in a packet flow or a set of packets, knows exactly its/their flow identity. We have thus
solved the problem of how a communication initiator identifies and distinguishes all packets in terms
of communication flows. The granularity of the identification is fine enough to distinguish a single,
unique communication entity within the lifetime of a flow related to that entity.
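The collision handling described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name `ServicingInitiator`, its methods, and the choice to keep the original ICT when it is still free are hypothetical and are not taken from any implementation described in this article.

```python
import random
from typing import Dict, Optional, Tuple

# Edge-ward header context of a flow: (L2 MAC, L3 IP, L4 ICT)
EdgeKey = Tuple[str, str, int]


class ServicingInitiator:
    """Hypothetical servicing initiator that keeps trunk-ward ICTs unique."""

    def __init__(self, seed: Optional[int] = None) -> None:
        self._rng = random.Random(seed)
        self.edge_to_sict: Dict[EdgeKey, int] = {}
        self.sict_to_edge: Dict[int, EdgeKey] = {}

    def trunk_ict(self, mac: str, ip: str, ict: int) -> int:
        """Return the ICT to use trunk-ward for the edge flow (mac, ip, ict)."""
        key = (mac, ip, ict)
        if key in self.edge_to_sict:           # known flow: reuse its mapping
            return self.edge_to_sict[key]
        if len(self.sict_to_edge) >= 1 << 16:  # 16-bit ICT space exhausted
            raise RuntimeError("all 65536 ICT values are in use")
        sict = ict                             # keep the original ICT if it is free
        while sict in self.sict_to_edge:       # collision: draw a fresh random S-ICT
            sict = self._rng.randrange(1 << 16)
        self.edge_to_sict[key] = sict
        self.sict_to_edge[sict] = key
        return sict

    def edge_context(self, sict: int) -> EdgeKey:
        """Recover the edge-ward (MAC, IP, ICT) for a returning packet."""
        return self.sict_to_edge[sict]
```

Two edge entities that happen to choose the same ICT are still distinguishable here because their MAC/IP contexts differ, which is what keeps the S-ICT mapping unique.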
We have so far described a practical and effective method to uniquely identify and distinguish a flow
along a service chain on the initiators’ half. Each initiator in a service chain maintains a triple
variable (L4 ICTs, L3 IPs, L2 MACs) that maps to the triple variable of the immediately edge-ward
initiator being served. When packets of a flow travel trunk-ward from the edge in the service chain, the
servicing initiator(s) maintain the unique identification and, if necessary, a unique mapping between
the two triple variables; when packets of a flow travel back edge-ward from the trunk, the servicing
initiator(s) use the maintained information to restore the triple variable of the edge-ward initiator
being served, so that the flow can be recognized by the latter. This chain of mappings between pairs of
triple variables involves modifying all or part of the L4 ICTs, L3 IPs, and/or L2 MACs in the packet
headers.
A flow maintained/mapped in this way is in essence recognized by the physical identity of an initiator.
By physical identity we mean that a mapping between an edge-ward ICT and the immediately trunk-ward
adjacent ICT not only involves the use of IPs/MACs but, more importantly, can be traced back to some
more essential context of the initiator being served, e.g., a UUID, which is fixed as a permanent
attribute even though IPs/MACs may change. Notice that IPs are architected to be logical, while MACs can
change when an entity moves its physical location, e.g., in a VM migration. Because the
maintenance/mapping we have formulated can be traced back to permanent identity attributes, we say that
an initiator of a communication flow has a “Physically Associated Address” (PAA). Within the lifetime of
a flow, the PAA of an initiator is globally unique.
For a servicing entity which handles a PAA, the PAA uniquely defines a communication flow which
originates from an initiator for an intended communication peer. Therefore, we can view a PAA as
constituting a globally unique unicast cable linking the PAA initiator and the intended
communication/servicing responder. As we have discussed above, this unicast cable is in essence
identified by the physical identity of the initiator. Below let us see how this logically unicast cable
reaches the intended communication responder peer.
3.2. Unique Identification of a Communication Peer of an Initiator
With the ingenious idea of separating the control path from the data path set forth by the Openflow
standard, an initiator (in most cases a servicing one), upon servicing a PAA-identified flow, can call
on an Openflow controller to pass the PAA mapping information to the intended communication/servicing
peer. In the Openflow standard, this is the so-called packetin/packetout flow establishment. A
packetin/packetout flow establishment is always between a pair of peers: an initiator and the intended
responder. The controller, with a global view of both peers, can help them establish and agree on the
unique flow-based PAA via the control path linking the controller and the two peers. With the help of
the Openflow controller using the separate control path, we now know that a PAA unicast cable indeed
exists between any pair of globally distributed initiator/responder peers. Therefore, with the help of
the Openflow controller, the unique PAA can unambiguously instruct both peers in the data path to
provide networking services/functions/management.
A PAA identified flow which is agreed upon between two communications/servicing peers can contain
sufficient amount of additional coding, mapping, management, and control information for the two
peers to know the intended communication semantics between them. Let us use “PAA metadata” to
name such additional information. Notice that PAA metadata is transmitted through the control path
with the help of an Openflow controller; that is, PAA metadata can be arbitrarily large, and need not be
limited to fixed spaces in packet headers. This is the key observation to enable PAA to code all and any
network service/function data between the two communications/servicing peers. Upon seeing a PAA,
the two peers can look up the PAA metadata to understand the communications/servicing semantics
between the two peers in the two ends of the PAA identified unicast cable.
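The role of “PAA metadata” can be illustrated with a toy model in plain Python. The sketch below does not use any real Openflow library: `Controller`, `Peer`, `packet_in` and `packet_out` are hypothetical names that only model metadata travelling over the control path, while a data-path packet would carry nothing beyond the PAA fields themselves.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, NamedTuple


class PAA(NamedTuple):
    mac: str   # L2 address
    ip: str    # L3 address
    ict: int   # L4 Initiator's Context Tag


@dataclass
class Peer:
    """A servicing peer holding metadata keyed by PAA (hypothetical model)."""
    name: str
    metadata: Dict[PAA, Dict[str, Any]] = field(default_factory=dict)

    def install(self, paa: PAA, meta: Dict[str, Any]) -> None:
        self.metadata[paa] = dict(meta)   # store a private copy

    def lookup(self, paa: PAA) -> Dict[str, Any]:
        return self.metadata[paa]


class Controller:
    """Toy stand-in for an Openflow controller; it only relays PAA metadata."""

    def packet_in(self, initiator: Peer, responder: Peer,
                  paa: PAA, meta: Dict[str, Any]) -> None:
        # The initiator reports a new flow; the metadata stays on the control path.
        initiator.install(paa, meta)
        self.packet_out(responder, paa, meta)

    def packet_out(self, responder: Peer, paa: PAA, meta: Dict[str, Any]) -> None:
        # The controller hands the same metadata to the responding peer.
        responder.install(paa, meta)
```

Because the metadata is an ordinary dictionary exchanged out of band, its size is not bounded by any packet header field; both peers simply look it up when they see the PAA.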
One might think that Openflow needs a global know-all, see-all, reach-all controller. This is a common
misunderstanding of Openflow. In fact, an Openflow controller can be specific and ad-hoc for a given
service/function/tenant. A (note, NOT the) controller only needs to mind the business of a given
service/function/tenant task, and hence only needs to look after a given pair of initiator/responder
peers. An Openflow controller can even be constructed ad hoc upon initiation of a PAA flow. There is no
need whatsoever for an Openflow controller to have a global power that sees, knows and controls all
businesses and all data paths. On the contrary, an Openflow controller can be very small, serving a
specific business or a given tenant.
To this end, we have provided all necessary know-how information for the “One Variable to Control
Them All” Openflow formulation and technology. In the remainder of the article let us provide a number
of application scenarios which shall manifest the power and usefulness of the proposed technology.
4. Application Examples
4.1. Overlay Network
Let T-initiator, S-initiator, T-responder, S-responder denote the terminal and servicing initiators and
responders, respectively.
T-initiator’s PAA: (L2: T-initiator’s MAC; L3: T-initiator’s overlay IP; L4: T-initiator’s ICT);
An S-initiator’s PAA at an S-gateway: (L2: S-gateway MAC; L3: S-initiator’s underlay IP; L4: S-initiator’s
ICT);
and
FlowEstablishmentPacketIn: PAA metadata;
where
PAA metadata in FlowEstablishmentPacketIn is sent only to the Openflow controller, for it to forward to
the intended responding peer via the control path, and is never transmitted in the data path between
the two peers.
Notice that, as we have discussed in Section 3.1, S-initiator’s ICT needn’t be equal to T-initiator’s ICT.
They will most likely differ when S-initiator sees an ICT collision between two T-initiators under its
service provisioning. Upon such an event, S-initiator can maintain a unique mapping between its two
S-ICTs and the colliding ICT of the two T-initiators; the mappings include each T-initiator’s overlay IP
to ensure uniqueness.
When T-initiator sends out the first packet of a new flow, S-initiator issues an Openflow packetin to an
“overlay network service controller”, asking for the controller’s help in agreeing on a flow-based PAA
between itself and the target servicing responder (S-responder). S-initiator also rewrites the
T-initiator’s PAA into its own PAA by modifying the L3 IP and/or L2 MAC. When forwarding a responding
packet back to the T-initiator, S-initiator rewrites its PAA back to the original T-initiator’s PAA.
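The PAA rewriting performed by S-initiator can be sketched as follows. This is a minimal illustration, not the author's implementation: the `Packet` model (which keeps only the initiator-side header fields and the payload) and the `SInitiator` class with its method names are assumptions made for this example.

```python
from typing import Dict, NamedTuple


class PAA(NamedTuple):
    mac: str   # L2 address
    ip: str    # L3 address (overlay IP for T, underlay IP for S)
    ict: int   # L4 Initiator's Context Tag


class Packet(NamedTuple):
    initiator: PAA   # initiator-side header fields of the packet
    payload: bytes


class SInitiator:
    """Hypothetical servicing gateway that rewrites PAAs instead of encapsulating."""

    def __init__(self, gateway_mac: str, underlay_ip: str) -> None:
        self.gateway_mac = gateway_mac
        self.underlay_ip = underlay_ip
        self.to_s: Dict[PAA, PAA] = {}   # T-initiator PAA -> S-initiator PAA
        self.to_t: Dict[PAA, PAA] = {}   # S-initiator PAA -> T-initiator PAA

    def establish(self, t_paa: PAA, s_ict: int) -> PAA:
        """Record the mapping for a new flow (done once, at flow establishment)."""
        s_paa = PAA(self.gateway_mac, self.underlay_ip, s_ict)
        if s_paa in self.to_t and self.to_t[s_paa] != t_paa:
            raise ValueError("S-ICT already in use by another flow")
        self.to_s[t_paa] = s_paa
        self.to_t[s_paa] = t_paa
        return s_paa

    def trunk_ward(self, pkt: Packet) -> Packet:
        """Rewrite the T-initiator's PAA into the S-initiator's PAA."""
        return pkt._replace(initiator=self.to_s[pkt.initiator])

    def edge_ward(self, pkt: Packet) -> Packet:
        """Restore the original T-initiator's PAA on a responding packet."""
        return pkt._replace(initiator=self.to_t[pkt.initiator])
```

Only header field values change in both directions; the payload is passed through untouched and the packet does not grow.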
The Openflow controller knowing the target S-responder (in fact, even knowing the T-responder, since
in this overlay network use case, the “overlay network controller” is actually owned by the overlay
network owner, e.g., a cloud tenant) can help agree on the PAAs between the two servicing peers. We
can write respective PAAs at the responders’ end as follows:
T-responder’s PAA: (L2: T-responder’s MAC; L3: T-responder’s overlay IP; L4: T-responder’s servicing
port);
An S-responder’s PAA at an S-gateway: (L2: S-gateway MAC; L3: S-responder’s underlay IP; L4: S-responder’s
ICT);
and
FlowEstablishmentPacketOut: PAA metadata;
where
PAA metadata in FlowEstablishmentPacketOut is received only from the Openflow controller and is never
transmitted in the data path between the two peers.
S-responder rewrites between the two PAAs in the same manner as S-initiator.
This overlay network use case covers all formulations of using labelling/encapsulation to construct
tunnels for achieving overlay network. For example, if “PAA metadata” contains a 24-bit “Virtual
Network Identifier (VNI)” or tenant identity, then it covers the case of VXLAN; the only difference is that
the “PAA metadata” passed through the control path need not be limited to 24 bits, and therefore
tenant isolation needn’t be limited to about 16.7 million (2^24) different tenants.
That the overlay network use case covers MPLS is also obvious: apply “One Variable to Control Them
All” for Openflow with the Customer Edge (CE) and Provider Edge (PE) of MPLS in the places of T and S of
this technology, and let “PAA metadata” contain the semantic information of the MPLS label.
In all overlay network use cases for tenants in multi-tenant clouds, underlay packet labelling in the
style of VXLAN’s VNI is unnecessary. When a new flow is established between T-initiator and T-responder,
S-initiator and S-responder know the identities of their respective terminal peers, since they must
know from/to which virtual end points the terminal packets are received/sent. Once a flow is
established, the uniquely correct PAA routing information in the flow table plays the role of network
isolation for tenants. Flow-based routing thus performs the isolation check only once per flow, at flow
establishment, unlike VNI-style underlay labelling/checking, where both servicing peers have to
label/check every packet frame. It is the control-forward path separation of Openflow that makes the
forwarding job much more efficient than in the old-fashioned non-separated technologies. Notice also
that, with encapsulation avoided, packet headers no longer grow in size, and thus the “One Variable to
Control Them All” formulation of using Openflow never causes packet fragmentation/reassembly, which is
another source of inefficiency in ad-hoc encapsulation protocols.
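The difference between a per-flow and a per-packet isolation check can be made concrete with a small flow-table model. The sketch below is hypothetical: the `FlowTable` class, the end point identifiers and the tenant directory `tenant_of` are assumptions introduced only to count how often the isolation check runs.

```python
from typing import Dict, Optional, Tuple

# Key of an established flow at the servicing peer: (MAC, IP, ICT)
FlowKey = Tuple[str, str, int]


class FlowTable:
    """Hypothetical flow table that checks tenant isolation once per flow."""

    def __init__(self, tenant_of: Dict[str, str]) -> None:
        self.tenant_of = tenant_of            # virtual end point id -> tenant
        self.flows: Dict[FlowKey, str] = {}   # established flow -> destination end point
        self.isolation_checks = 0

    def establish(self, key: FlowKey, src_ep: str, dst_ep: str) -> bool:
        """Run the isolation check at flow establishment only."""
        self.isolation_checks += 1
        src_tenant = self.tenant_of.get(src_ep)
        if src_tenant is None or src_tenant != self.tenant_of.get(dst_ep):
            return False                      # different tenants: no flow entry
        self.flows[key] = dst_ep
        return True

    def forward(self, key: FlowKey) -> Optional[str]:
        """Per-packet path: a plain table lookup, no tenant check."""
        return self.flows.get(key)
```

A packet whose flow was never established finds no entry in the table, so isolation holds without any label being carried or inspected per packet.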
It is also interesting to observe that the fixed, constant labelling method of all
labelling/encapsulation protocols repeatedly forwards the same constant overlay header information in
every packet frame of a flow whenever the flow contains more than one frame (such as a large file or a
video stream). It thus becomes apparent that the “One Variable to Control Them All” formulation of using
Openflow makes much more bandwidth-efficient use of the valuable data path resource, since it avoids
transmitting completely redundant data that carries zero information for network control. In addition,
the “One Variable to Control Them All” formulation of using Openflow opens up many more use cases,
described below.
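To give a sense of scale for the redundancy discussed above, the following sketch computes the bytes added by VXLAN encapsulation over a flow. It assumes an IPv4 outer header without options and no outer VLAN tag, which gives the commonly quoted 50 bytes per frame; the function names are illustrative only.

```python
# Per-frame bytes added by VXLAN encapsulation.
# Assumption: IPv4 outer header without options, no outer VLAN tag.
OUTER_ETHERNET = 14
OUTER_IPV4 = 20
OUTER_UDP = 8
VXLAN_HEADER = 8
VXLAN_OVERHEAD = OUTER_ETHERNET + OUTER_IPV4 + OUTER_UDP + VXLAN_HEADER  # 50 bytes


def encapsulation_overhead(frames: int, per_frame: int = VXLAN_OVERHEAD) -> int:
    """Total extra bytes sent on the data path for a flow of `frames` frames."""
    if frames < 0:
        raise ValueError("frames must be non-negative")
    return frames * per_frame


def rewrite_overhead(frames: int) -> int:
    """Header rewriting changes field values in place, so it adds no bytes."""
    if frames < 0:
        raise ValueError("frames must be non-negative")
    return 0
```

Under these assumptions a flow of one million frames carries 50,000,000 bytes of repeated outer headers with encapsulation and none with header rewriting.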
4.2. IPsec
To cover the IPsec function, “PAA metadata” can contain descriptions for authentication/encryption
algorithms and metadata, e.g., crypto checksum, and/or random initialization vector (IV) which is needed for a
semantically secure encryption scheme.
For this use case to be applied to IPv4, encryption of the payload should be conducted on L5 and above,
leaving the L4 header in the clear so that the two servicing peers can agree on and understand the
mapping and coding. We notice that servicing ICTs are random numbers and that the Openflow control path
is encrypted (SSL); thus exposing L4 with random ICTs does not raise any security issue, just as, by
well-known cryptographic discipline, encryption algorithms and non-key parameters need not be kept
secret.
Block cipher algorithms used for encryption should not expand the block size. This requirement is met by
most block ciphers and standard modes of operation, such as CBC, CFB, OFB and CTR.
If this use case is applied to IPv6, then even L4 can be encrypted. This is because the IPv6 header
includes a 20-bit field for flow information (the flow label), which can hold the 16-bit ICT.
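How a 16-bit ICT fits into the 20-bit IPv6 flow label can be shown with a little bit packing. The sketch below builds only the first 32-bit word of an IPv6 header (4-bit version, 8-bit traffic class, 20-bit flow label); placing the ICT in the low 16 bits of the flow label and leaving the upper 4 bits zero is an assumption of this example, not something specified in this article.

```python
import struct

IPV6_VERSION = 6
FLOW_LABEL_MASK = (1 << 20) - 1   # the flow label is 20 bits wide


def pack_first_word(ict: int, traffic_class: int = 0) -> bytes:
    """Build the first 4 bytes of an IPv6 header with the ICT in the flow label."""
    if not 0 <= ict < (1 << 16):
        raise ValueError("ICT must fit in 16 bits")
    if not 0 <= traffic_class < (1 << 8):
        raise ValueError("traffic class must fit in 8 bits")
    flow_label = ict & FLOW_LABEL_MASK   # upper 4 bits of the label stay zero
    word = (IPV6_VERSION << 28) | (traffic_class << 20) | flow_label
    return struct.pack("!I", word)       # network byte order


def unpack_ict(header: bytes) -> int:
    """Read the ICT back from the low 16 bits of the flow label."""
    (word,) = struct.unpack("!I", header[:4])
    return word & 0xFFFF
```

Since the ICT travels in the L3 header in this arrangement, the servicing peers can still map the flow even when the L4 header is encrypted.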
4.3. Dynamic Multipath Traffic Engineering and Load-Balancing
An S-initiator may detect terminal traffic in need of multipath TE or load-balancing. It can dynamically
summon TE or LB controllers to find multiple S-peers to provide multipath TE or LB. Notice that
“PAA metadata” can serve such QoS services by identifying terminal peers precisely. Ad-hoc underlay
labelling technology obviously cannot provide this pinpointing, because a fixed data-path label, e.g., a
VXLAN VNI, cannot differentiate terminal points/services: it can only identify the set of terminal
points sharing the tunnel.
4.4. Network Function Virtualization
Monitoring network performance and behavior needs fine-granularity identification of terminal/servicing
points. With arbitrary “PAA metadata”, which can be dynamically established between a pair of S-peers,
NFV for monitoring anomalous behavior can be performed to isolate the troubled point/terminal.
From the application examples described so far, a reader with average knowledge of networking
technology will understand that the “One Variable to Control Them All” for Openflow technology can,
without loss of generality, be applied to many more network services and functions, forming a
methodological replacement for all previous technologies that attempt to identify network packets by
placing various ad-hoc control labels in packet headers.
5. Truly Scalable and Interoperable Cloud
In the overlay network use cases of Section 4.1, the S-ICT is a random number agreed upon between the
two servicing peers. Both servicing peers can use it to code the flow routing information between their
respective gateways and the T-peers they serve. Upon establishing a new flow, S-initiator can offer
several ICTs for S-responder to pick one that is not yet in use at its end. Thus, with the help of the
Tenant SDN Controller (TSC), the two servicing ends may be in different clouds, and neither needs to
expose its intranet information to the other S-peer. Nor does the TSC need to know the intranet
information of the two clouds. Therefore, the “One Variable to Control Them All” formulation of using
Openflow naturally supports inter-cloud network patching: independently orchestrated clouds (e.g., under
Openstack orchestration) can cooperate to serve truly large and elastic cloud services for tenants (we
name this “One Cloud Two Openstack”), just as today’s electricity grid allows any power plant to join in
and supply electricity. In contrast, an SDN controller running a labelling/encapsulation protocol
between two clouds is essentially performing the network orchestration of one enlarged cloud, since it
sees and controls the whole data path, which contains sections of the link layer within both clouds (we
name this “One Cloud One Openstack”). That is exactly why the labelling/encapsulation protocols have
another name, “larger layer 2 protocols”: the controller indeed sees and controls a large link layer
consisting of sections in both cloud intranets. Unfortunately, a large cloud containing too many
communication entities (“One Cloud One Openstack”) cannot be operated stably.
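The ICT negotiation mentioned in this section, where S-initiator offers several candidates and S-responder picks one that is free at its end, might look like the sketch below. The function names, the default of four candidates and the first-free selection rule are assumptions of this example; each side consults only its own set of ICTs in use, so neither exposes intranet state to the other.

```python
import random
from typing import List, Optional, Set

ICT_SPACE = 1 << 16   # a 16-bit ICT has 65536 possible values


def offer_icts(in_use: Set[int], count: int = 4,
               rng: Optional[random.Random] = None) -> List[int]:
    """S-initiator side: offer up to `count` random ICTs that are free locally."""
    rng = rng or random.Random()
    free = [v for v in range(ICT_SPACE) if v not in in_use]
    return rng.sample(free, min(count, len(free)))


def pick_ict(offered: List[int], in_use: Set[int]) -> Optional[int]:
    """S-responder side: accept the first offered ICT that is free locally."""
    for ict in offered:
        if ict not in in_use:
            return ict
    return None   # every candidate collides: the initiator must offer again
```

In this sketch the offer and the pick would travel over the control path through the tenant's controller, which only relays candidate numbers and learns nothing about either intranet.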
6. Conclusion
The separation of control path and data path in Openflow opens up a completely new way to control and
manage networking. This article presents for the first time the observation that the Openflow control
path can be used to agree on control and management information between network servicing peers: the
“One Variable to Control Them All” formulation of using Openflow. The observation is then applied,
without loss of generality, to a number of use case scenarios to show the many advantages that the new
way of using Openflow has over the conventional, ad-hoc and seemingly endless upgrading of labelling and
encapsulation technologies. With the Openflow standard being widely adopted, the new “One Variable to
Control Them All” formulation of using Openflow in essence marks the end of an era for ad-hoc
labelling/encapsulation protocols.
Announcement
A public cloud based on the “One Variable to Control Them All” for Openflow technology is in open beta
trial. It can connect across datacenters, across independent cloud orchestration platforms (so-called
One Cloud Two Openstack), and across multiple forms of overlay instances such as Docker containers,
virtual machines, and hardware servers, with tenant-specified logical IP VPCs which are completely
decoupled from the underlay physical network. Please sign up for a free trial account at:
www.daolicloud.com
(NVI stands for “Network Virtualization Infrastructure” and is an intellectual property of DaoliCloud
Company)
Acknowledgements
I would like to thank all employees of DaoliCloud for their excellent implementation of NVI, for
collective and sustained innovation together, and for the hard work that has made very useful
contributions.
A discussion with Baoping XUE of Sugon was helpful to shape the exposition of this article.