April 2013 I www.dcseurope.info 13
www.snia-europe.org
SNIA Europe
#dcsarticle http://www.dcseurope.info/n/qpaw
How DCB makes iSCSI better
By Allen Ordoubadian,
SNIA ESF Marketing Chair, Emulex
A challenge with traditional iSCSI deployments is the non-deterministic nature of Ethernet networks. When Ethernet networks carried only non-storage traffic, lost packets were not a big issue, as they would simply be retransmitted. However, as storage traffic was layered over Ethernet, lost packets became unacceptable: storage traffic is far less forgiving than non-storage traffic, and the resulting retransmissions introduce I/O delays that storage workloads cannot tolerate. In addition, traditional Ethernet had no mechanism to assign priorities to classes of I/O. A new solution was therefore needed, and short of building a separate Ethernet network to handle iSCSI storage traffic, Data Center Bridging (DCB) is that solution.
The DCB standard is a key enabler of effectively deploying iSCSI over
Ethernet infrastructure. The standard provides the framework for high-
performance iSCSI deployments with key capabilities that include:
• Priority Flow Control (PFC) - enables "lossless Ethernet", a consistent stream of data between servers and storage arrays. It prevents dropped frames and maximizes network efficiency. PFC also helps to optimize SCSI communication and minimizes the effects of TCP retransmissions, making the iSCSI flow more reliable.
• Quality of Service (QoS) and Enhanced Transmission Selection (ETS) - support protocol priorities and allocation of bandwidth for iSCSI and IP traffic.
• Data Center Bridging Capabilities eXchange (DCBX) - enables automatic network-based configuration of key network and iSCSI parameters.
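To make ETS concrete, here is a toy model of weight-based bandwidth allocation with redistribution of unused share. This is an illustration of the idea only, not switch firmware; the class names, weights and demand figures are invented for the example.

```python
# Illustrative model of ETS-style bandwidth allocation (IEEE 802.1Qaz).
# Class names, weights and demands are hypothetical example values.

def ets_allocate(link_gbps, classes):
    """Split link bandwidth by ETS weight, redistributing any
    bandwidth left unused by classes demanding less than their share."""
    total_weight = sum(c["weight"] for c in classes)
    alloc = {}
    spare = 0.0
    # First pass: each class gets its weighted share, capped by demand.
    for c in classes:
        share = link_gbps * c["weight"] / total_weight
        alloc[c["name"]] = min(share, c["demand_gbps"])
        spare += share - alloc[c["name"]]
    # Second pass: hand spare bandwidth to classes still wanting more.
    hungry = [c for c in classes if c["demand_gbps"] > alloc[c["name"]]]
    for c in hungry:
        extra = min(spare / len(hungry), c["demand_gbps"] - alloc[c["name"]])
        alloc[c["name"]] += extra
    return alloc

classes = [
    {"name": "iscsi", "weight": 60, "demand_gbps": 7.0},
    {"name": "lan",   "weight": 30, "demand_gbps": 2.0},
    {"name": "mgmt",  "weight": 10, "demand_gbps": 0.5},
]
print(ets_allocate(10.0, classes))
# iSCSI gets its 6 Gb/s share plus the spare the other classes left unused.
```

The key ETS property the sketch captures is that minimum shares are guaranteed under contention, while idle bandwidth is not wasted.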
With DCB, iSCSI traffic is more balanced over high-bandwidth 10GbE
links. From an investment protection perspective, the ability to support
iSCSI and LAN IP traffic over a common network makes it possible to
consolidate iSCSI storage area networks with traditional IP LAN traffic
networks. There is also another key component needed for iSCSI over DCB. This component is part of the Data Center Bridging Capabilities eXchange (DCBX) standard, and it is called the TCP Application Type-Length-Value, or simply "TLV". TLV allows the DCB infrastructure to apply unique ETS and PFC settings to specific sub-segments of the TCP/IP traffic. This is done through switches that can identify the sub-segments based on the TCP socket or port identifier included in the TCP/IP frame. In short, TLV directs servers to place iSCSI traffic on available PFC queues, separating storage traffic from other IP traffic. PFC also eliminates data retransmission and supports a consistent data flow with low latency. IT administrators can leverage QoS and ETS to assign bandwidth and priority for iSCSI storage
traffic, which is crucial to supporting critical applications. Depending on your overall datacenter environment, running iSCSI over DCB can therefore improve performance by ensuring a consistent stream of data, which yields "deterministic performance" and eliminates the packet loss that can cause high latency. It also delivers quality of service, through per-protocol allocation of bandwidth for better control of service levels within a converged network, and network convergence itself.
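The classification the application TLV enables can be sketched as follows. Port 3260 is the IANA-registered iSCSI port; the priority number and the mapping table are invented for the example, and a real switch does this in hardware, not Python.

```python
# Toy model of what a DCBX application-priority (APP) TLV achieves:
# frames are steered to a PFC priority queue based on TCP port.
# Port 3260 is the IANA-registered iSCSI port; priority 4 is an example.

ISCSI_PORT = 3260
APP_PRIORITY_TABLE = {ISCSI_PORT: 4}   # advertised via the DCBX APP TLV
DEFAULT_PRIORITY = 0

def classify(dst_port):
    """Return the 802.1p priority a DCB switch would assign."""
    return APP_PRIORITY_TABLE.get(dst_port, DEFAULT_PRIORITY)

queues = {}
for port in (80, 3260, 443, 3260):
    queues.setdefault(classify(port), []).append(port)
print(queues)  # → {0: [80, 443], 4: [3260, 3260]}
# iSCSI frames land on their own lossless PFC queue, away from LAN traffic.
```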
For more information on this topic or technologies discussed in this
blog, please visit www.snia.org/forums/esf
Why NFSv4.1 & pNFS are
better than NFSv3 could
ever be
By Alex McDonald,
SNIA ESF Chair - File Protocols, NetApp
NFSv4 has been a standard file sharing protocol since 2003, but has not been widely adopted, partly because NFSv3 was "just good enough". Yet NFSv4 improves on NFSv3 in many important ways, and NFSv4.1 is a further improvement on that. In this post, I explain how NFSv4.1 is better suited to a wide range of datacenter and HPC uses than its predecessors NFSv3 and NFSv4, and provide resources for migrating from NFSv3 to NFSv4.1. Most importantly, I make the argument that users should, at the very least, be evaluating and deploying NFSv4.1 for use in new projects; and ideally, should be using it wholesale in their existing environments.
The background to NFSv4.1
NFSv2 and its popular successor NFSv3 (specified in RFC-1813, but never an Internet standard) came from Sun; NFSv3 was first released in 1995. NFSv3 has proved a popular and robust protocol over the 15 years it has been in use, and with wide adoption it soon eclipsed some of the early competing UNIX-based filesystem protocols such as DFS and AFS. NFSv3 was extensively adopted by storage vendors and OS implementers beyond Sun's Solaris; it was available on an extensive list of systems, including IBM's AIX, HP's HP-UX, Linux and FreeBSD.
Even non-UNIX systems adopted NFSv3: Mac OS, OpenVMS, Microsoft Windows, Novell NetWare, and IBM's AS/400 systems. In
recognition of the advantages of interoperability and standardization,
Sun relinquished control of future NFS standards work; work leading to NFSv4 proceeded by agreement between Sun and the Internet Society (ISOC), and was undertaken under the auspices of the Internet Engineering Task Force (IETF).
In April 2003, the Network File System (NFS) version 4 Protocol
was ratified as an Internet standard, described in RFC-3530, which
superseded NFSv3. This was the first open filesystem and networking
protocol from the IETF. NFSv4 introduces the concept of state to ameliorate some of the less desirable features of NFSv3, along with other enhancements to improve usability, management and performance.
But shortly following its release, an Internet draft written by Garth Gibson and Peter Corbett outlined several problems with NFSv4; specifically, its limited bandwidth and scalability, since NFSv4, like NFSv3, requires that all access go to a single server. NFSv4.1 (as
described in RFC-5661, ratified in January 2010) was developed to
overcome these limitations, and new features such as parallel NFS
(pNFS) were standardized to address these issues.
NFSv4.2 is now moving towards ratification. In a change to the
original IETF NFSv4 development work, where each revision took
a significant amount of time to develop and ratify, the workgroup
charter was modified to ensure that there would be no large standards
documents that took years to develop, such as RFC-5661, and that
additions to the standard would be an on-going yearly process. With
these changes in the processes leading to standardization, features
that will be ratified in NFSv4.2 (expected in early 2013) are available
from many vendors and suppliers now.
Adoption of NFSv4.1
Every so often, I, and others in the industry, run Birds-of-a-Feather
(BoFs) on the availability of NFSv4.1 clients and servers, and on the
adoption of NFSv4.1 and pNFS. At our latest BoF at LISA ’12 in San
Diego in December 2012, many of the attendees agreed: it's time to
move to NFSv4.1.
While there have been many advances and improvements to NFS,
many users have elected to continue with NFSv3. NFSv4.1 is a
mature and stable protocol with many advantages in its own right
over its predecessors NFSv3 and NFSv2, yet adoption remains
slow. Adequate for some purposes, NFSv3 is a familiar and well-
understood protocol; but with the demands being placed on storage
by exponentially increasing data and compute growth, NFSv3 has
become increasingly difficult to deploy and manage. In essence,
NFSv3 suffers from problems associated with statelessness. While
some protocols such as HTTP and other RESTful APIs see benefit
from not associating state with transactions – it considerably simplifies
application development if no transaction from client to server
depends on another transaction – in the NFS case, statelessness has
led, amongst other downsides, to performance and lock management
issues.
NFSv4.1 and parallel NFS (pNFS) address well-known NFSv3
“workarounds” that are used to obtain high bandwidth access; users
that employ (usually very complicated) NFSv3 automounter maps and
modify them to manage load balancing should find pNFS provides
comparable performance that is significantly easier to manage.
So what’s the problem with NFSv3?
Extending the use of NFS across the WAN is difficult with NFSv3.
Firewalls typically filter traffic based on well-known port numbers,
but if the NFSv3 client is inside a firewalled network, and the server
is outside the network, the firewall needs to know what ports the
portmapper, mountd and nfsd servers are listening on. As a result
of this promiscuous use of ports, the multiplicity of “moving parts”
and a justifiable wariness on the part of network administrators to
punch random holes through firewalls, NFSv3 is not practical to use
in a WAN environment. By contrast, NFSv4 integrates many of these
functions, and mandates that all traffic (now exclusively TCP) uses the
single well-known port 2049.
Plus, NFSv3 is very chatty for WAN usage; many messages may be sent between the client and the server to undertake
simple activities, such as finding, opening, reading and closing
a file. NFSv4 can compound these operations into a single RPC
(Remote Procedure Call) and reduce considerably the back-and-
forth traffic across the network. The end result is reduced latency.
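The saving from compounding can be sketched with a back-of-the-envelope round-trip count. The 20 ms WAN round-trip time is a hypothetical figure, and the operation lists are illustrative (they are real NFSv3 procedures and NFSv4 operations, but a real client's sequence varies):

```python
# Rough latency comparison: NFSv3 issues one RPC per operation, while
# NFSv4 can bundle operations into a single COMPOUND call.
# RTT_MS = 20 is a hypothetical WAN round-trip time.

RTT_MS = 20

def nfsv3_open_read_close():
    ops = ["LOOKUP", "GETATTR", "ACCESS", "READ"]  # one RPC each
    return len(ops) * RTT_MS

def nfsv4_open_read_close():
    compound = ["PUTFH", "LOOKUP", "OPEN", "READ", "CLOSE"]  # one RPC
    return 1 * RTT_MS

print(nfsv3_open_read_close())  # 80 ms of wire time over this WAN
print(nfsv4_open_read_close())  # 20 ms for the same logical work
```

The absolute numbers are invented; the point is that compounding collapses the per-operation round trips into one.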
One of the most annoying NFSv3 “features” has been its handling
of locks. Although NFSv3 is stateless, the essential addition of lock
management (NLM) to prevent file corruption by competing clients
means NFSv3 application recovery is slowed considerably. Very often stale locks have to be released manually, and lock management is handled externally to the protocol. NFSv4's built-in lock leasing, lock timeouts, and client-server negotiation on recovery simplify management considerably.
In a change from NFSv3, these locking and delegation features
make NFSv4 stateful, but the simplicity of the original design is
retained through well-defined recovery semantics in the face of client
and server failures and network partitions. These are just some of
the benefits that make NFSv4.1 desirable as a modern datacenter
protocol, and for use in HPC, database and highly virtualized
applications. NFSv3 is extremely difficult to parallelise, and often
takes some vendor-specific “pixie dust” to accomplish. In contrast,
pNFS with NFSv4.1 brings parallelization directly into the protocol; it
allows many streams of data to multiple servers simultaneously, and
it supports files as per usual, along with block and object support
through an extensible layout mechanism. The management is
definitely easier, as NFSv3 automounter maps and hand-created load-
balancing schemes are eliminated and, by providing a standardized
interface, pNFS ensures fewer issues in supporting multi-vendor NFS
server environments.
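The core pNFS idea can be sketched as follows: the metadata server hands the client a layout mapping byte ranges to data servers, and the client then reads those ranges directly, in parallel. The server names, stripe size and round-robin striping pattern below are invented for illustration; real layouts are negotiated by the protocol.

```python
# Sketch of a pNFS file layout: each stripe of a file maps to a data
# server, so clients can issue reads to several servers at once.
# Server names and the striping scheme are hypothetical.

STRIPE_SIZE = 4096
DATA_SERVERS = ["ds1", "ds2", "ds3"]

def layout_for(file_size):
    """Map each stripe of the file to a data server, round-robin."""
    segments = []
    offset = 0
    while offset < file_size:
        ds = DATA_SERVERS[(offset // STRIPE_SIZE) % len(DATA_SERVERS)]
        length = min(STRIPE_SIZE, file_size - offset)
        segments.append((ds, offset, length))
        offset += length
    return segments

for ds, offset, length in layout_for(10000):
    # A real client would issue these three reads concurrently.
    print(f"read {length} bytes at offset {offset} from {ds}")
```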
For more information about the Ethernet Storage Forum and
what the File Protocols Special Interest Group is up to, please visit:
www.snia.org/forums/esf
VN2VN: “Ethernet only”
FCoE is coming
By David Fair,
ESF Business Development Chair, Intel
The completion of a specification for FCoE (T11 FC-BB-5, 2009) held great promise for carrying storage and LAN traffic over a unified Ethernet network, and now we are seeing the benefits. With FCoE,
Fibre Channel protocol frames are encapsulated in Ethernet packets.
To achieve the high reliability and “lossless” characteristics of Fibre
Channel, Ethernet itself has been enhanced by a series of IEEE 802.1
specifications collectively known as Data Center Bridging (DCB). DCB
is now widely supported in enterprise-class Ethernet switches. Several
major switch vendors also support the capability known as Fibre
Channel Forwarding (FCF), which can de-encapsulate/encapsulate the Fibre Channel protocol frames to allow, among other things, the support of legacy Fibre Channel SANs from an FCoE host.
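The encapsulation itself is simple to picture: the Fibre Channel frame rides as the payload of an Ethernet frame carrying the registered FCoE EtherType, 0x8906. The sketch below is deliberately simplified; real FCoE frames also carry a version field, SOF/EOF delimiters and padding, and the MAC addresses here are placeholders.

```python
import struct

# Minimal illustration of FCoE encapsulation: a Fibre Channel frame
# carried as the payload of an Ethernet frame with EtherType 0x8906
# (the registered FCoE EtherType). The FCoE header is simplified here.

FCOE_ETHERTYPE = 0x8906

def encapsulate(dst_mac, src_mac, fc_frame):
    """Prepend a 14-byte Ethernet header to the FC frame payload."""
    eth_header = struct.pack("!6s6sH", dst_mac, src_mac, FCOE_ETHERTYPE)
    return eth_header + fc_frame

frame = encapsulate(b"\x0e\xfc\x00\x00\x00\x01",   # placeholder MACs
                    b"\x0e\xfc\x00\x00\x00\x02",
                    b"\x00" * 36)                   # placeholder FC frame
print(len(frame), hex(int.from_bytes(frame[12:14], "big")))
# The EtherType field is what tells a DCB switch this is FCoE traffic.
```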
The benefits of unifying your network with FCoE can be significant: in the range of a 20-50% reduction in total cost of ownership, depending on the details of the deployment. This is significant enough to start the ramp of
FCoE, as SAN administrators have seen the benefits and successful
Proof of Concepts have shown reliability and delivered performance.
However, the economic benefits of FCoE can be even greater than
that. And that’s where VN2VN — as defined in the final draft T11 FC-
BB-6 specification — comes in. This spec completed final balloting in
January 2013 and is expected to be published this year. The code has
been incorporated in the Open FCoE code (www.open-fcoe.org).
“VN2VN” refers to Virtual N_Port to Virtual N_Port in T11-speak. But
the concept is simply "Ethernet only" FCoE. It allows discovery and communication between peer FCoE nodes without the existence of, or dependency on, a legacy FCoE SAN fabric (FCF). The Fibre Channel
protocol frames remain encapsulated in Ethernet packets from host to
storage target and storage target to host. The only switch requirement
for functionality is support for DCB. FCF-capable switches and their
associated licensing fees are expensive. A VN2VN deployment of
FCoE could save 50-70% relative to the cost of an equivalent Fibre
Channel storage network. It’s these compelling potential cost savings
that make VN2VN interesting. VN2VN could significantly accelerate
the ramp of FCoE. SAN admins are famously conservative, but cost
savings this large are hard to ignore.
An optional feature of FCoE is security support through FCoE Initialization Protocol (FIP) snooping.
FIP snooping, a switch function, can establish firewall filters that
prevent unauthorized network access by unknown or unexpected
virtual N_Ports transmitting FCoE traffic. In BB-5 FCoE, this requires
FCF capabilities in the switch. Another benefit of VN2VN is that it can
provide the security of FIP snooping, again without the requirement
of an FCF. Technically, what VN2VN brings to the party is a new T11 FIP discovery process that enables two peer FCoE nodes, say a host
and storage target, to discover each other and establish a virtual link.
As part of this new process of discovery they work cooperatively to
determine unique FC_IDs for each other. This is in contrast to the
BB-5 method where nodes need to discover and login to an FCF to
be assigned FC_IDs. A VN2VN node can login to a peer node and
establish a logical point-to-point link with standard fabric login (FLOGI)
and port login (PLOGI) exchanges.
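The cooperative FC_ID assignment can be pictured with a toy model. The real FC-BB-6 FIP exchange is far richer (probe and claim messages, timers, MAC address derivation); this only captures the "propose, detect clash, re-propose" idea, with invented seeds standing in for each node's proposal source.

```python
import random

# Toy model of VN2VN cooperative FC_ID selection: two peer nodes each
# propose an FC_ID, and a clash forces one side to re-propose, so the
# pair ends up with unique IDs without any FCF to assign them.
# Seeds and the resolution rule are illustrative, not from FC-BB-6.

def choose_fc_ids(seed_a, seed_b):
    rng_a, rng_b = random.Random(seed_a), random.Random(seed_b)
    id_a = rng_a.randrange(1, 0xFFFFFF)   # node A's proposed FC_ID
    id_b = rng_b.randrange(1, 0xFFFFFF)   # node B's proposed FC_ID
    while id_b == id_a:                   # clash detected: B re-proposes
        id_b = rng_b.randrange(1, 0xFFFFFF)
    return id_a, id_b

node_a, node_b = choose_fc_ids(1, 2)
print(hex(node_a), hex(node_b))
# With unique FC_IDs agreed, the peers proceed to FLOGI/PLOGI exchanges.
```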
VN2VN also has the potential to bring the power of Fibre Channel protocols to new deployment models, the most exciting of which is disaggregated storage. With VN2VN, a rack of diskless servers could access a
shared storage target with very high efficiency and reliability. Think of
this as “L2 DAS,” the immediacy of Direct Attached Storage over an
L2 Ethernet network. But storage is disaggregated from the servers
and can be managed and serviced on a much more scalable model.
The future of VN2VN is bright.
For more information about SNIA’s Ethernet Storage Forum, please
visit: www.snia.org/forums/esf