How to trace ODX [FSCTL_OFFLOAD_READ & FSCTL_OFFLOAD_WRITE API calls] transfers on NetApp Clustered ONTAP
Note: This document is written purely to demonstrate how to trace an ODX transfer. It is not intended to test the performance benefits of ODX, for the reasons given below.
I am not testing the functional efficiency of ODX in this exercise because I run all my applications on top of VMware Workstation, which in turn runs on a Microsoft Windows 8 client operating system. In that setup it makes no difference whether ODX is used or not, and in the worst case performance may even degrade. The whole point here is to show how to trace and understand ODX token-based technology in a simulator lab environment. Hopefully, this will enable you to trace the ODX FSCTL functions in procmon.
However, towards the end of the document I have quoted certain figures from the ESG Lab validation report for the Dell EqualLogic storage array, to show the difference ODX makes to data transfer time compared with traditional host-based transfer. If you want to know more about the functional efficiency of ODX, you can either 'mine' the ESG Lab validation report or look at the ODX-compliant storage array of your choice; this could be NetApp, EMC, Dell, Hitachi, Nimble, HP, IBM, etc.
Ashwin Pawar
Dec, 2014
ashwinwriter@gmail.com
Ingredients required:
In this demo I am tracing ODX for a simple file copy using robocopy between two NTFS-formatted LUNs mapped to Windows Server 2012 R2 hosts via iSCSI, with Clustered ONTAP as the back end. Please note: I am not tracing ODX in a Hyper-V environment.
1. NetApp Clustered ONTAP Simulator [ODX is not supported on the 7-Mode ONTAP operating system]. You can download the simulator from the following link, provided you have a valid support account.
http://mysupport.netapp.com/NOW/download/tools/simulator/ontap/8.X/
[Only Clustered Data ONTAP 8.2 and later releases support ODX for copy offloads. Choose whichever version you like; at the time of writing, the most current version of the Clustered ONTAP simulator is 8.2.1.]
In 7-Mode ONTAP, 'Device Feature Not Supported' is reported.
You may run the simulator on either VMware Workstation or an ESXi server; it's entirely your choice. I do most things on top of VMware Workstation, primarily because of the computational limits of my laptop, and also because my sole intention is to understand the application or test features. That is it!
2. Server: Microsoft Windows Server 2012/2012 R2 [you can download the trial version from Microsoft], connected to the Clustered ONTAP Simulator via iSCSI.
http://technet.microsoft.com/en-gb/evalcenter/hh670538.aspx
Optional client: Windows 8 and later also support ODX.
3. The key ingredient: Microsoft Process Monitor, also called procmon for short.
http://download.sysinternals.com/files/ProcessMonitor.zip
Note: When transferring data between two volumes, ODX only works if both volumes are hosted on the same storage array (single or dual controller). Even if two volumes from different storage arrays are mapped to the same server, data transfers between them will use the traditional buffered copy operation.
Microsoft ODX
With Windows 8 and Windows Server 2012, Microsoft introduced ODX [Offloaded Data Transfer]. ODX is a data-copy offload technology, designed to offload data-copy functions to *intelligent storage arrays in much the same way that VMware does with VAAI.
In other words, ODX is to Windows what VAAI is to VMware. The difference is that VMware has been doing it for longer and therefore has more capabilities and is perhaps more mature, whereas Microsoft introduced this technology with the Windows 8 and Windows Server 2012 releases. However, the differences will eventually even out.
Like the VAAI primitives, ODX was introduced to speed up certain storage-based operations such as large copy operations and bulk zeroing, while at the same time saving host CPU, NIC utilization, and network bandwidth.
*Storage devices that implement the SPC-4 and SBC-3 specifications. ODX utilizes T10 XCOPY Lite primitives, making it standards-based.
ODX works with the following technologies:
Hyper-V virtual hard disks (VHD)
SMB shares (also referred to as CIFS)
Physical disks
FC, iSCSI, FCoE, SAS
When ODX is invoked
ODX is invoked when a copy command is issued from:
CLI
PowerShell prompt
Windows Explorer
Robocopy
Hyper-V Migration Job
Note: This applies as long as you are running Windows Server 2012 / Windows 8 or later and the underlying storage array supports ODX.
Why do we need ODX?
To find out why, we need to look at the existing data transfer method at our disposal, so let's see how a traditional data transfer occurs:
Traditional host-based data transfer, as shown in the figure below:
1. The data to be transferred is read from the storage through the source server.
2. Then transferred across the network to the destination server.
3. Finally, written back to the storage through the destination server.
Why is this painful?
Well, the transfer begins at the storage level, travels through the host server consuming host-server resources, over the LAN consuming network resources, through the destination server consuming destination-server resources, and finally lands back in the storage. Now that's a lot of hopping... phew!
Now that we understand these limitations, let's appreciate how Microsoft's ODX transfer works under the hood.
ODX-compliant data transfer, as shown in the figure below:
1. Offload read request is sent from Windows.
As part of the offload read request -
Data is read on the Storage Server & Good status is returned to Windows.
Windows requests ROD token from Storage server.
2. Storage server returns ROD token to Windows.
3. Offload Write request is handed to the remote Windows server.
4. Remote Windows server sends offload write request to the Storage server.
5. Storage Server performs the copy of data locally within Storage Array.
6. Storage server returns Good status to Windows.
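The six steps above can be sketched as a toy model. This is only an illustration of the token hand-off: the `ToyArray` class, its method names and the LUN names are hypothetical and do not correspond to any NetApp or Windows API. The point it tries to show is that only the token passes through the hosts, while the data itself is copied inside the array.

```python
import secrets


class ToyArray:
    """Toy stand-in for an ODX-capable storage array (hypothetical, not a real API)."""

    def __init__(self):
        self.luns = {}    # LUN name -> bytearray holding that LUN's data
        self.tokens = {}  # ROD token -> point-in-time copy of the requested range

    def offload_read(self, lun, offset, length):
        # Steps 1-2: the array captures the range and hands back a token.
        # Only the 512-byte token travels to the host, not the file data.
        token = secrets.token_bytes(512)
        self.tokens[token] = bytes(self.luns[lun][offset:offset + length])
        return token

    def offload_write(self, token, dst_lun, dst_offset):
        # Steps 4-6: the array resolves the token and copies the data internally.
        data = self.tokens[token]
        self.luns[dst_lun][dst_offset:dst_offset + len(data)] = data
        return len(data)  # bytes moved inside the array


array = ToyArray()
array.luns["source"] = bytearray(b"ODX!" * 256)  # 1 KB of sample data
array.luns["dest"] = bytearray(1024)

token = array.offload_read("source", 0, 1024)   # issued by the source host
# Step 3: the token (not the data) is handed to the destination host.
copied = array.offload_write(token, "dest", 0)  # issued by the destination host
host_bytes = len(token)                         # all that crossed the hosts
```

In this sketch 1024 bytes end up on the destination LUN, yet the hosts only ever handle the 512-byte token.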
Note: ROD stands for 'Representation of Data', and a ROD token serves as a point-in-time representation of that data. Following is the typical ROD token format.
Note: A ROD token can be a vendor-specific 512-byte string that represents the data range to be copied. ODX can also be used to perform bulk zeroing; to do this, a well-known 'zero' ROD token is used.
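As a rough sketch of what such a token might look like in memory, the snippet below builds a 512-byte 'zero' token. The 512-byte size comes from the note above; the field layout (4-byte type, 2 reserved bytes, 2-byte ID length, 504-byte body) and the `0xFFFF0001` type value are assumptions modelled on the STORAGE_OFFLOAD_TOKEN structure in the Windows headers and should be checked against them before being relied on. The function names are illustrative.

```python
import struct

TOKEN_SIZE = 512                   # total token length, as stated in the text
TOKEN_ID_LENGTH = 0x1F8            # 504 bytes after the 8-byte header (assumed)
TOKEN_TYPE_ZERO_DATA = 0xFFFF0001  # well-known zero token type (assumed)


def build_zero_token():
    # Assumed layout: 4-byte type, 2 reserved bytes, 2-byte ID length,
    # all big-endian, followed by 504 bytes that are zero for the zero token.
    header = struct.pack(">IHH", TOKEN_TYPE_ZERO_DATA, 0, TOKEN_ID_LENGTH)
    return header + bytes(TOKEN_ID_LENGTH)


def is_zero_token(token):
    # A token counts as the zero token if it has the full size and the zero type.
    (token_type,) = struct.unpack(">I", token[:4])
    return len(token) == TOKEN_SIZE and token_type == TOKEN_TYPE_ZERO_DATA


zero_token = build_zero_token()
```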
ODX Limitations
As with any new feature, there are bound to be some limitations, which tend to get fixed as time goes by and the feature matures. At present, with the initial release of ODX copy offload, certain things such as deduplication, Windows BitLocker encryption, and DFS mini-filter drivers do not work with it. However, future versions of these filter drivers may be made ODX-aware by Microsoft.
Steps to trace an ODX transfer:
On the Host: [Enable ODX]
1. Make sure ODX is enabled on the Windows Server 2012 box. FYI: ODX is enabled by default in Microsoft Windows Server 2012 and later.
a. To cross-check, open a Windows PowerShell session as an administrator
b. Check whether ODX is currently enabled (it is by default) by verifying that the FilterSupportedFeaturesMode value in the registry equals 0.
To do so, type the following command as shown in figure below.
Get-ItemProperty hklm:\system\currentcontrolset\control\filesystem -Name "FilterSupportedFeaturesMode"
To disable ODX set the registry value to ‘1’.
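The same check can be scripted. The following is a minimal sketch, assuming Python is available on the Windows host; the function names are illustrative. `winreg` is the standard-library registry module and exists only on Windows, so it is imported inside the function that needs it.

```python
def odx_enabled(filter_supported_features_mode):
    """Interpret the registry value: 0 means ODX is enabled, 1 means disabled."""
    return filter_supported_features_mode == 0


def read_mode_from_registry():
    # winreg is available only on Windows, so import it lazily here.
    import winreg

    path = r"SYSTEM\CurrentControlSet\Control\FileSystem"
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, path) as key:
        value, _value_type = winreg.QueryValueEx(key, "FilterSupportedFeaturesMode")
    return value
```

On the Windows host, `odx_enabled(read_mode_from_registry())` should return True while ODX is left at its default setting.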
On the Host: [Ensure that all the mini-filter drivers are opted in to ODX for the given volume]
For the volume on which you want to use ODX, list all of the file system filter drivers attached to it. To do so, open a Windows PowerShell session as an administrator, and then type the following command, where <volume> is the drive letter of the volume:
fltmc instances -v <volume>
For each filter driver listed, check the SupportedFeatures value, shown in the 'SprtFtrs' column of the FLTMC output. If it is '3', as in the above output, the filter supports ODX.
So, what do you mean by 'opted-in'? This means that all the mini-filter drivers attached to the given volume must support ODX; otherwise the ODX copy will not work and the system will fall back to the traditional copy method.
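The 'all filters must opt in' rule can be expressed as a small check. This is a sketch under two assumptions: that the value 3 is the combination of an offload-read bit (0x1) and an offload-write bit (0x2), and that the SprtFtrs values have already been read off the FLTMC output by hand. The filter names in the sample are hypothetical placeholders, not real drivers.

```python
OFFLOAD_READ = 0x1   # assumed bit for offload-read support
OFFLOAD_WRITE = 0x2  # assumed bit for offload-write support


def opted_in(sprt_ftrs):
    """True when a filter advertises both offload read and offload write (value 3)."""
    needed = OFFLOAD_READ | OFFLOAD_WRITE
    return (sprt_ftrs & needed) == needed


def blocking_filters(filters):
    """Return the names of filters that would force a fallback to traditional copy."""
    return [name for name, value in filters.items() if not opted_in(value)]


# Hypothetical SprtFtrs values copied by hand from an fltmc listing.
sample = {"FilterA": 3, "FilterB": 1, "FilterC": 0}
blockers = blocking_filters(sample)
```

If `blockers` is empty, every filter attached to the volume has opted in and the ODX path is available.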
According to Microsoft, ODX is not supported by the following file system mini-filter drivers:
Data Deduplication [In the figure above Dedupe does show value ‘3’ as opted-in ?]
BitLocker Drive Encryption
DFS driver [Not mentioned in the Microsoft KB, but it is noted]
According to a Microsoft support community post: Dedupe doesn’t support ODX on files that are being or have been de-duplicated. There are legitimate cases where certain files or even volumes are excluded from deduplication. If we don’t declare the support for ODX in our manifest ODX will be automatically disabled even in this “pass-through” mode of Dedup, which we believe is not in customer’s best interest.
It has also been observed that the DFS driver breaks ODX; hence you need to make sure this driver is not attached to the volume during ODX offload copy operations.
On the storage side:
You must be running NetApp Clustered Data ONTAP 8.2 or a later release. In this test demo, I am running the Clustered Data ONTAP 8.2.1 Simulator; let me take you through it.
1. Open NetApp OnCommand System Manager:
As you can see, I have a cluster named FILER-CM.TEST.COM with an IP address of 192.168.0.75.
Simply double-click to open the GUI.
As you can see, it's a two-node Clustered ONTAP simulator running version 8.2.1.
One can also SSH to the cluster name or IP address and use the CLI, as shown below.
2. Inside the cluster I have created a Vserver [now called a Storage Virtual Machine, or SVM] named CM02_DATA, as shown below.
On this Vserver I have carved out two volumes from the same aggregate.
3. On top of the two volumes, I have created two LUNs, one per volume. The LUNs are presented to the two hosts running Windows Server 2012 R2 and formatted with NTFS.
Source server name : WIN2K12R2
Dest. server name: VCENTER
One can also run the following command to check whether the two Windows Server 2012 hosts have logged in successfully to the Clustered ONTAP simulator.
To trace an ODX transfer I needed some data in the first place, so I copied a single file of just over 270 MB into a folder named 'share' on the NTFS volume 'E:' on the LUN attached to the server named 'WIN2K12R2'. I created a similar but empty folder on the LUN hosted on the other 2012 server, named 'VCENTER', as shown below. I made sure that both folders are shared on their respective machines.
According to the ODX specification, files must be 256 KB or larger; smaller files are transferred using a traditional (non-ODX) file transfer.
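The size rule is easy to check before running a test copy. A minimal sketch, assuming the 256 KB threshold means 256 x 1024 bytes; the function name is illustrative:

```python
ODX_MIN_FILE_SIZE = 256 * 1024  # 256 KB threshold from the text, in bytes (assumed binary KB)


def odx_candidate(size_bytes):
    """True when a file is large enough to be considered for an ODX transfer."""
    return size_bytes >= ODX_MIN_FILE_SIZE


test_file_size = 270 * 1024 * 1024  # roughly the 270 MB test file used in this demo
is_candidate = odx_candidate(test_file_size)
```

For a real file, `os.path.getsize(path)` can supply the size in bytes.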
Please note: when the source LUN and destination LUN are each mounted with a file system, as in our case, the copy application automatically calls FSCTL_OFFLOAD_READ and FSCTL_OFFLOAD_WRITE to perform the data transfer from the source LUN to the destination LUN.
4. Start Procmon by double-clicking the procmon icon on the desktop.
5. Now, let's run the robocopy file transfer while procmon is capturing in the background on the source server. I am using the standard robocopy tool to copy a single file of just over 270 MB.
From Windows PowerShell, I executed the robocopy transfer as shown in the example below:
You can trace the copy-offload FSCTL API calls while the data transfer is going on or at the end of the transfer; it's up to you. Because procmon continuously traces everything in the background, the trace can get very large, so finding the two FSCTL functions, OFFLOAD_READ and OFFLOAD_WRITE, in a long continuous trace can be a little time-consuming.
The trick here is to note the time when the actual robocopy transfer kicks off and then follow the procmon activity from there on.
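If the trace is saved from procmon as a CSV file, the two calls can also be picked out with a short script instead of scrolling. This is a sketch under assumptions: the column names follow procmon's usual CSV export, the offload call name is assumed to appear somewhere in the row (typically the Detail column), and the sample rows, PID and file name below are synthetic placeholders, not captured output.

```python
import csv
import io

OFFLOAD_CALLS = ("FSCTL_OFFLOAD_READ", "FSCTL_OFFLOAD_WRITE")


def offload_events(csv_file):
    """Yield the rows of a procmon CSV export that mention an offload FSCTL."""
    for row in csv.DictReader(csv_file):
        text = " ".join(str(value) for value in row.values())
        if any(call in text for call in OFFLOAD_CALLS):
            yield row


# Synthetic rows in the assumed export layout, for illustration only.
SAMPLE = r'''"Time of Day","Process Name","PID","Operation","Path","Result","Detail"
"10:00:01.0000000 AM","Robocopy.exe","1234","CreateFile","E:\share\file.bin","SUCCESS",""
"10:00:02.0000000 AM","Robocopy.exe","1234","FileSystemControl","E:\share\file.bin","SUCCESS","Control: FSCTL_OFFLOAD_READ"
"10:00:03.0000000 AM","Robocopy.exe","1234","FileSystemControl","\\VCENTER\share\file.bin","SUCCESS","Control: FSCTL_OFFLOAD_WRITE"
'''

events = list(offload_events(io.StringIO(SAMPLE)))
```

For a real export, open the file with `open(path, newline="", encoding="utf-8-sig")` and pass the file object to `offload_events`.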
My thoughts
Procmon is a very handy tool for testing as well as troubleshooting. Once you become friendly with this tool, you will benefit a great deal.
How easy is it to adapt to Clustered Data ONTAP?
To be very honest, it's not that difficult. If you have worked on 7-Mode ONTAP for some time [a few years] and you understand the basic CLI, it should be a smooth transition. With clustered mode the CLI experience becomes much richer and easier. Anyone who has worked on 7-Mode knows that one has to remember the 7-Mode commands, type help, and/or keep a guide by one's side to get familiar with the most common ONTAP commands.
However, with Clustered ONTAP comes command-line completion (also called tab completion). This is useful in several ways: commonly used commands, especially ones with long names, require fewer keystrokes to reach, and commands with long or difficult-to-spell names can be entered by typing the first few characters and pressing the completion key.
With tab completion you can dig in and mine just about everything using the CLI, which wasn't possible earlier with 7-Mode. I would really encourage 7-Mode users to gradually switch to Clustered ONTAP, and the best way to achieve this is through practice on the simulator.
NetApp, like any other storage vendor, is basically a software company even though such companies are called storage hardware vendors; the key is the software [data management code], which makes all the difference. All the storage vendors use the same hardware under the hood, namely SAS, SATA, and SSD drives. What makes them special is the code [firmware/software]. Therefore a little effort in setting up a clustered lab environment, plus some practice, will go a long way in easing your nerves around Clustered ONTAP.
ESG Lab report on ODX with Dell EqualLogic
ESG Lab testing on Windows Server 2012 with an ODX-compliant Dell storage array resulted in a copy that was eight times faster than the traditional host-based transfer. This report is available on the internet for download; you can google it if you wish.
Test Lab environment:
Two servers were connected to an ODX-compliant Dell EqualLogic storage array.
The storage array consisted of:
12 x 600GB SAS drives. A single RAID 5 pool was created with two volumes, one of which contained a 75GB VM while the other was empty.
ESG Lab transferred a VM using the traditional non-ODX method and the new ODX method. The Lab monitored network utilization and elapsed time for the transfer to complete in both test cases. The results are shown in Figure below:
What the Numbers Mean
1. The ODX transfer took approximately six and a half minutes for the VM to migrate completely to the other server, and the average network bandwidth consumption was around 64Kb/sec.
2. Using the non-ODX method, moving the 75GB VM over the network took approximately 52 minutes and consumed 4Mb/sec of network bandwidth.
3. The ODX method completed eight times faster than the non-ODX method while consuming virtually no server CPU or network resources.
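The 'eight times faster' figure follows directly from the two elapsed times quoted above, as this small calculation shows (the variable names are illustrative; the inputs are the approximate times from the report as cited here):

```python
odx_minutes = 6.5      # approximate ODX transfer time quoted above
non_odx_minutes = 52   # approximate non-ODX transfer time quoted above

speedup = non_odx_minutes / odx_minutes        # how many times faster ODX was
minutes_saved = non_odx_minutes - odx_minutes  # elapsed time saved per 75GB VM move
```

With these inputs the ratio works out to 8, with about 45.5 minutes saved per transfer.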
Recommended Microsoft publications
Introduction to Offloaded Data Transfers (ODX)
http://msdn.microsoft.com/en-us/library/windows/hardware/jj248724.aspx
Offloaded Data Transfer (ODX) with Intelligent Storage Arrays
http://msdn.microsoft.com/en-us/library/windows/hardware/hh833784.aspx
Hardware & Software requirements for ODX, and how to verify the performance of ODX after you implement it:
http://technet.microsoft.com/en-gb/library/jj200627.aspx
Windows Offloaded Data Transfers Overview
http://technet.microsoft.com/en-us/library/hh831628.aspx
ODX requirement with Clustered Data ONTAP
https://library.netapp.com/ecmdocs/ECMP1196891/html/GUID-CD068AE0-CA5F-4920-8457-471797A9981D.html
Courtesy: Microsoft, Dell, ESG & NetApp