ONTAP storage network flow control


This article is about the flow-control dilemma on 10G Ethernet NICs: should flow control be enabled or disabled?

Published in: Technology

  1. Flow control dilemma and NetApp’s changing recommendations

     “Turn the flow-control valve anticlockwise in order to disable it on the NetApp storage.” Just joking…

     Please note: the views, thoughts, and opinions expressed in this article belong solely to the author, and not necessarily to the author's employer, organization, committee, or other group or individual. If you spot any incorrect information in this article, please feel free to correct me.
  2. Timeline of changing recommendations

     JAN 2013 | TR-3802 - Disable flow-control: between SWITCH & STORAGE
     MAR 2015 | TR-4392 - Disable flow-control: end-to-end, Server/ESX <> SWITCH <> STORAGE
     JAN 2016 | TR-4182 - Disable flow-control: on cluster ports only; the rest is for you to decide
  3. This article recommends ‘dilemma’ #1

     Philosophy of dilemma #1: let flow control be managed higher up the stack, in the form of congestion control. Applications can do this much better, because hardware-based flow control is not application aware.

     Application level: a TCP connection uses the end-to-end connection to determine its window size, which can take into account bandwidth, buffer space, and round-trip time, and so can deal with congestion much more efficiently.

     Hardware level: the switch port or NIC decides when to send a PAUSE frame, and for what duration, while taking into account only the link between SWITCH & STORAGE. Unfortunately, no upper-level protocols are considered.

     For these reasons, it is recommended to disable flow control on SWITCH ports and STORAGE NODE NICs. In simple [similar hardware] and smaller networks, flow control may work well. However, in larger networks with more advanced and faster network equipment and software, technologies such as TCP windowing and increased switch buffering work better.

     Please note: the recommendations mentioned here are based purely on the theoretical assumption that flow control is handled better higher up the stack.

     WISDOM: flow control can be disabled directly only on a dedicated 10G Ethernet NIC. On Converged Network Adapter (CNA/UTA) cards it cannot be disabled on the storage side; if you disable flow control on the switch port, it is automatically disabled for devices such as CNAs. You may not realize it, but depending on the SWITCH settings it may already be ‘none’ because the SWITCH port is set to ‘none’.

     You may be looking at a VLAN or IFGRP port that shows ‘full’ and conclude that the flow-control setting for the 10G port is ‘full’. You can safely ignore that value because you are looking in the wrong place: as long as flow control for the physical ports shows ‘none’, you are done.

     You can use the -type physical switch with the ‘port show’ command to list only physical ports:

     ::> port show -fields type,flowcontrol-admin,flowcontrol-oper -speed-oper 10000 -type physical
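The point about TCP sizing its window from end-to-end bandwidth and round-trip time can be made concrete with the bandwidth-delay product. The following is a minimal Python sketch, not something taken from the NetApp technical reports; the 10 Gbit/s link speed and 0.5 ms round-trip time are assumed values chosen purely for illustration.

```python
def bandwidth_delay_product_bytes(bandwidth_bits_per_s: float, rtt_seconds: float) -> float:
    """Return the number of bytes that must be in flight to keep a path full.

    This is the quantity an end-to-end TCP window has to cover; a link-level
    PAUSE frame has no visibility of the round-trip time at all.
    """
    return bandwidth_bits_per_s * rtt_seconds / 8


# Assumed, illustrative values: a 10 Gbit/s path with a 0.5 ms round-trip time.
bdp = bandwidth_delay_product_bytes(10e9, 0.0005)
print(f"{bdp:.0f} bytes")  # prints "625000 bytes"
```

A window smaller than this value leaves the path idle part of the time, which is the kind of end-to-end consideration that hardware flow control cannot take into account.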
  4. What is Ethernet flow control?

     Ethernet flow control is a layer 2 network mechanism that is used to manage the rate of data transmission between two endpoints. It lets one network node control the transmission speed of another so that the receiving node is not overwhelmed with data.

     MOST IMPORTANT INFO for NetApp: you can modify the autonegotiation, duplex, flow control, speed, and health settings only on a physical network port, not on a VLAN or IFGRP. The only port setting that you can modify for a VLAN or IFGRP is the MTU size.

     Difference between the flow-control admin and operational values:

     • Flow-control-admin is the administrative value. You control it, and it is configured on the STORAGE NODE’s physical ports.
     • Flow-control-oper is the operational state of flow control as reported after negotiation with the SWITCH port. You have no direct control over it.

     Hence, if you disable flow control on the physical ports on the storage side but flow-control-oper still says ‘full’, the network switch needs to be updated so that flow control is fully disabled, or set to ‘none’.

     Best practice recommended by NetApp: what are the flow-control best practices for 10G Ethernet?

     Historically (7-Mode/clustered Data ONTAP), NetApp recommended that flow control be disabled on all network ports [cluster & data] within a NetApp Data ONTAP cluster. This is no longer the case. Guidance in this area has since changed; the new recommended best practice is as follows:

     • Disable flow control on cluster network ports in the Data ONTAP cluster: the correct setting for flow control on cluster ports is ‘none’.
     • Flow control on the remaining network ports (the ports that provide data, management, and intercluster connectivity) should be configured to match the settings within the rest of your environment, i.e. the same value out of ‘none/receive/send/full’ end-to-end.

     However, NetApp SMEs still recommend disabling flow control for normal data ports as well, citing performance improvements reported by various clients.
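The admin versus operational distinction above amounts to a simple check: a port whose admin value is ‘none’ but whose operational value is something else points at the switch side. A minimal Python sketch of that check follows. The dictionary layout and the sample values are hypothetical and are not ONTAP output; only the field names mirror flowcontrol-admin and flowcontrol-oper.

```python
def ports_needing_switch_change(ports):
    """Return the names of ports set to 'none' on the storage side whose
    operational flow-control state is still something other than 'none'."""
    return [
        p["port"]
        for p in ports
        if p["flowcontrol_admin"] == "none" and p["flowcontrol_oper"] != "none"
    ]


# Hypothetical sample data, for illustration only.
sample = [
    {"port": "e0c", "flowcontrol_admin": "none", "flowcontrol_oper": "full"},
    {"port": "e0d", "flowcontrol_admin": "none", "flowcontrol_oper": "none"},
]
print(ports_needing_switch_change(sample))  # prints ['e0c']
```

Any port this reports would, following the reasoning above, be fixed on the switch rather than on the storage node.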
  5. Why should flow control be disabled in clustered Data ONTAP?

     • First: buffer limitations on some switches.
     • Second: more data, better hardware.
     • Third: congestion control.

     For information on each point, read the following post [Courtesy: Justin Parisi]. The general idea is to let flow control be managed higher up the stack, in the form of congestion control.

     Maintenance window recommended: keep in mind that changing flow control on a port results in a brief blip in connectivity, as the port resets to read the new configuration. It is therefore best to change flow control in a maintenance window.

     IMPORTANT: flow control should be disabled throughout the network, i.e. from source to destination. Otherwise, it will not bring any benefit and may even worsen performance.

     Attention: make sure the switch ports are also set to disable/none for flow control, to match the flow-control settings on the storage Ethernet ports.

     In the following exercise we disable flow control on the dedicated physical 10G data ports serving CIFS.

     Steps to set the flow-control setting to ‘none’: first identify the physical ports for which you want to disable flow control, starting with the LIF, VLAN, and IFGRP that the CIFS LIFs belong to.

     LIF:
     ::> network interface show -vserver svm
     This command output provides the VLAN name [provided a VLAN exists].

     VLAN:
     ::> vlan show
     This command provides the IFGRP name [provided an IFGRP exists].

     IFGRP:
     ::> ifgrp show
     Finally, this command provides the names of the physical ports that we are looking for.
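The lookup chain described above (LIF, then VLAN, then IFGRP, then physical ports) can be sketched as a small resolver in Python. This is only an illustration under assumptions: the three dictionaries are hypothetical stand-ins for what you would read off the three command outputs by hand, and the example names are the ones used later in this deck.

```python
def physical_ports_for_lif(lif, lif_port, vlan_parent, ifgrp_members):
    """Walk LIF -> (optional) VLAN -> (optional) IFGRP -> physical ports."""
    port = lif_port[lif]
    # If the LIF sits on a VLAN port, step down to the port underneath it.
    port = vlan_parent.get(port, port)
    # If that port is an IFGRP, return its members; otherwise it is physical.
    return ifgrp_members.get(port, [port])


# Hypothetical mappings, filled in by hand from the command outputs.
lif_port = {"cluster-01_cifs_1": "a0a-189"}
vlan_parent = {"a0a-189": "a0a"}
ifgrp_members = {"a0a": ["e0c", "e0d"]}

print(physical_ports_for_lif("cluster-01_cifs_1", lif_port, vlan_parent, ifgrp_members))
# prints ['e0c', 'e0d']
```

When no VLAN or IFGRP exists, the lookups fall through and the function returns the LIF's own port, which matches the "provided a VLAN/IFGRP exists" caveats in the steps.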
  6. From the output of the commands above, we have the information needed to change flow control on the ports used for CIFS:

     LIF: cluster-01_cifs_1 [used for serving CIFS connections]. This LIF sits on an IFGRP.
     • IFGRP: a0a : e0c & e0d [physical ports bonded together to form a VIF/IFGRP]
     • VLAN: a0a-189 [to which this IFGRP belongs]

     Please note: we are concerned only with the physical 10G ports e0c & e0d because, as stated earlier, you can change the flow-control settings only for physical ports.

     Run the following command to note the current flow-control settings:

     ::> port show -fields type,flowcontrol-admin,flowcontrol-oper -speed-oper 10000 -type physical

     Steps to disable flow control:

     1. Migrate the LIF that is serving CIFS to the partner node. First make sure auto-revert is set to false until you finish the task:

        ::> net int modify -vserver svm -lif cluster-01_cifs_1 -auto-revert false

        Then migrate the LIF to the partner node before you go ahead:

        ::> net int migrate -vserver svm -lif cluster-01_cifs_1 -dest-node cluster-02 -dest-port xx

     2. Remove: it is recommended to remove the physical ports one by one from the IFGRP, set flow control to ‘none’ on each, and then add it back to the IFGRP. IFGRP a0a consists of physical ports e0c & e0d.

        a. Remove e0c from IFGRP a0a:
           ::> ifgrp remove-port -node cluster-01 -ifgrp a0a -port e0c
        b. Set flow control to ‘none’:
           ::> network port modify -node cluster-01 -port e0c -flowcontrol-admin none
        c. Add e0c back to IFGRP a0a:
           ::> ifgrp add-port -node cluster-01 -ifgrp a0a -port e0c

        Repeat the exercise for e0d, then migrate the LIF back to cluster-01 and repeat the same procedure on the partner node.

     3. Once it is done, verify that flow control indeed shows ‘none’, which means disabled:

        ::> port show -fields type,flowcontrol-admin,flowcontrol-oper -speed-oper 10000 -type physical

     July, 2018
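Because the remove, modify, and add-back sequence is repeated for every member port and then again on the partner node, it can help to generate the command list rather than retype it. The Python sketch below only builds the command strings shown in this deck; it does not connect to or run anything on a cluster. The function name is hypothetical, and the node, IFGRP, and port names are placeholders taken from the example.

```python
def flowcontrol_disable_commands(node, ifgrp, ports):
    """Build the per-port command sequence: remove the port from the IFGRP,
    set flowcontrol-admin to none, then add the port back."""
    commands = []
    for port in ports:
        commands.append(f"ifgrp remove-port -node {node} -ifgrp {ifgrp} -port {port}")
        commands.append(f"network port modify -node {node} -port {port} -flowcontrol-admin none")
        commands.append(f"ifgrp add-port -node {node} -ifgrp {ifgrp} -port {port}")
    return commands


# Placeholder names from the example; review each line before pasting it.
for line in flowcontrol_disable_commands("cluster-01", "a0a", ["e0c", "e0d"]):
    print("::>", line)
```

Handling one port completely before touching the next keeps the ordering recommended in step 2, so the IFGRP never loses both members at once.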