More IOPS Please
DRMC’s VMware View Implementation Using Nexenta
Keith Brennan
October 2011
Delano Regional Medical Center
- 156-bed community hospital in central California.
- Four satellite clinics.
- Only hospital in a 30-mile radius.
- Serves approximately 60,000 people spread over several communities.
- 80%+ of our patients are Medi-Cal or Medicare.
  - Government doesn’t pay well.
The Great Directive of 2009
- Need to deploy 150 new desktops in support of a Clinical Documentation implementation.
- Do it as cheaply as possible.
- Oh, by the way, you’re losing an FTE due to budget cuts.
“Never let a good crisis go to waste.” –Rahm Emanuel
- Used this “opportunity” to justify moving to VDI.
  - Users were resistant to using something other than a traditional desktop.
    - Perceived lack of freedom.
    - Perceived increase in “Big Brother.”
- Why I wanted the transition to VDI:
  - Ease of management.
  - We had a set, well-defined, integrated desktop experience.
  - Wanted a way to deliver the same experience in a controlled manner to a myriad of devices: iOS, Android, etc.
I Need Storage!
- My existing EMC CX500 was barely cutting it for 3 ESX hosts with a combined 32 VMs.
- Lots of people on the virtualization forums liked NetApp.
- NetApp had just published a white paper on a 750 View virtual desktop deployment on a FAS 2050a.
  - Near-normal desktop load times.
  - Seamless user experience.
Well That’s Timely!
- The next week another vendor calls to let me know that IBM is running a huge storage sale.
- It includes their N series of network attached storage.
  - Rebadged NetApps.
- Three weeks later an N3600, a rebadged NetApp 2050a, arrives.
- It is set up identically to the VDI white paper’s setup.
Implementation Guidelines
- Linked clones are to be used whenever possible.
  - Ease of maintenance.
  - Ease of provisioning.
- No user data to be stored on the VMs.
- Significant patching shall be done through the Golden Image, and VMs will be re-provisioned using the updated image.
- AV will run on the VMs, but only in real-time scan mode. No scheduled system scans.
Initial Testing
- Two hosts with 25 VMs each.
  - One connected to the N3600 via iSCSI.
  - The other via NFS.
- Test lab of 25 thin clients.
- Good performance.
  - Equivalent to a desktop of the previous generation.
  - Quick user logins due to the VMs being always on and waiting.
  - The N3600 is maintaining low utilization.
  - NFS and iSCSI exhibit similar speed.
Go Live!
- Five additional ESX hosts are deployed.
  - Each hosts ~25 VMs.
  - Current setup gives me N+2 host redundancy.
- For the first week everything looks good.
- User complaints are primarily with the clinical application.
- N3600 is handling it well. Running at about 35% utilization.
  - ~1.5k IOPS of regular background chatter.
  - VMs report average latency of 12 ms.
Disaster! For me they seem to happen in threes.
- First AV engine update happens 1 week after go-live.
  - AV server pushes it to all clients at once.
  - The simultaneous update of all the View VMs forces the SAN to a crawl for 3 hours.
  - Users complain that the virtual desktops are unusable.
- Temporarily corrected the problem by only allowing the AV to update 3 machines at once.
  - This worked like a champ until a dot version update on the AV server a month later broke that setting.
  - Another 3-hour “downtime.”
Disaster (cont)
- Three days later a helpdesk tech forces the simultaneous reprovisioning of 60 of the View VMs.
  - Was applying an application patch.
  - Was trained not to restart more than 5 VMs at once.
  - That obviously didn’t stick!
- That was another hour of the SAN crawling.
- Once again, users complain that the system was unusable during this time.
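
Both of these incidents come down to too many VMs hitting the storage at once. As a minimal sketch of the "no more than 5 at a time" rule, the helper below splits a list of VMs into fixed-size waves. The VM names and the `batches` helper are hypothetical; actually restarting or reprovisioning each wave (and waiting for the storage to settle between waves) is left to whatever tooling is in use.

```python
from typing import Iterator, List, Sequence


def batches(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive groups of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


# Hypothetical names for the 60 VMs from the incident above.
vm_names = [f"view-vm-{n:03d}" for n in range(1, 61)]

# Waves of 5, matching the limit the helpdesk was trained on.
waves = list(batches(vm_names, 5))

for number, wave in enumerate(waves, start=1):
    # Placeholder: a real script would trigger the operation for this wave
    # and wait for it to finish before moving on to the next one.
    print(f"wave {number}: {', '.join(wave)}")
```

Enforcing the limit in a script rather than in training is the point: the setting cannot be forgotten by a tech or silently reset by a software update.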
Disaster! (yet again)
- .NET 3.5 service pack is approved for deployment.
- SP is large: >100 MB.
- Set to deploy starting at 2 am and only on restart.
  - At 04:15 four VMs restart within one minute of each other.
  - N3600 starts to lag.
  - Users, seeing their systems running slow, decide to restart.
- At 5 am I get the call regarding the issue.
- I immediately disabled the SP deployment.
  - Still took an hour for the N3600 to catch up.
What’s Going On???
- Oh $41+…
- General use chatter is eating my bandwidth.
  - N3600 CPU utilization is now regularly above 50%.
  - Disk utilization rarely drops below 40%.
  - Average disk latency >18 ms.
I Have a Problem
- I’m maxing performance with just day-to-day operations.
- IBM has verified that the appliance is functioning properly.
  - In other words, this is all I’m going to get out of it.
- Adding disks might help some, but too costly!
  - Additional tray would be $15k!
  - SAS drives to populate it are almost $1k each!
- Still have CPU limitations.
- NIC limitations (two 1 GbE links per head).
- Did I mention that I have no money left in the budget?
Nexenta to the Rescue
- Had just installed Nexenta Core for my home file server.
- Time to find some hardware:
  - Pulled a box out of the View cluster.
  - Installed six Intel SSDs.
  - Installed Nexenta Core. (Yeah, I know... EULA...)
  - Created the volume and shared it via NFS.
- The next day my poor brain figured out that I could have just done a Nexenta VM. Doh!
- Over the next week I migrated half the virtual desktops over.
It’s Like Night and Day
- Average latency drops from 18 ms to 2 ms.
- Write throughput quadruples.
- Read throughput doubles.
- 20x improvement on 4K IOPS!
Time For a Full Nexenta Implementation
- I was able to secure $45k capital for the next year.
  - Normally this would just draw laughter when talking about storage.
- I also intend to replace the existing EMC.
  - Annual maintenance too costly.
  - I despise the fact that I have to call them out every time I want to connect a new piece of hardware to it.
- Still some questioning from higher-ups on this whole open-storage thing.
Final Solution Hardware
- 2x Supermicro dual-Xeon servers with 96 GB RAM.
- 1x DataOn 1600 JBOD
  - Houses twenty-one 1 TB nearline SAS drives.
- 1x DataOn 1620 JBOD
  - Houses seventeen 300 GB 10k rpm SAS drives.
- 2x STEC ZeusRAM
- 8x 160 GB Intel 320 SSDs
Why DataOn?
- Disk Shelf Manager
  - One thing Nexenta lacked was a way to monitor the JBODs.
  - How could one of my techs know which drive to pull?
- Intuitive slot lighting.
- They’re responsive even after the sale is made!
Why Nexenta?
- It’s good to have on-demand support.
  - I am the only member of our technical staff who has a basic understanding of storage architectures.
  - I like to have the ability to go on vacation from time to time!
- It’s good to have experts for unique problems.
- Regular, tested bug fixes.
- It’s always nice to have someone’s neck to wring!
The End Result
- 2 ms latency.
- 500 MB/s reads.
- 200 MB/s writes.
- Happy users!
- Note: Benchmark was done on the production system with 175 active VMs.
To Dedup or Not to Dedup
- Dedup can give you huge storage savings.
  - I had a 14x dedup ratio on my VDI volume.
- Inline dedup saves on disk write I/O.
  - It’ll still hit the ZIL, but won’t be written to disk if it is determined to be duplicated data.
  - Instead of a 4+ KB write you get a sub-256-byte metadata write.
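
To make the ratio concrete, here is a rough sketch of how a block-level dedup ratio can be computed: hash every fixed-size block and divide the number of logical blocks by the number of unique hashes. The 4 KB block size, the use of SHA-256 as the block hash, and the synthetic "volume" built from 14 identical images are all assumptions for illustration. This is not how the 14x figure above was measured.

```python
import hashlib

BLOCK_SIZE = 4096  # assumption: 4 KB blocks, as in the write example above


def dedup_ratio(data: bytes, block_size: int = BLOCK_SIZE) -> float:
    """Return logical blocks divided by unique blocks for `data`."""
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    if not blocks:
        return 1.0
    # Blocks with the same hash are treated as duplicates and stored once.
    unique = {hashlib.sha256(block).digest() for block in blocks}
    return len(blocks) / len(unique)


# Synthetic data: one "image" made of 8 distinct blocks, copied 14 times.
image = b"".join(bytes([i]) * BLOCK_SIZE for i in range(8))
volume = image * 14

ratio = dedup_ratio(volume)
print(f"dedup ratio: {ratio:.1f}x")
```

Every duplicate block found this way is a full block write that turns into a small metadata update instead.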
To Dedup or Not to Dedup (cont)
- RAM hog!
  - For good performance you need enough RAM to store the dedup table.
  - Uses ARC for this, which means you will have less room for cached data.
- Potential for hash collision.
  - Odds are astronomical, but still a chance for data corruption.
- Dedup performance penalty.
  - Small IOPS suffer.
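
A back-of-the-envelope sketch of the RAM cost: the dedup table needs one entry per unique block, so its size is roughly unique data divided by block size, times the bytes per entry. The 320 bytes per in-core entry used below is a commonly quoted rule of thumb, not a figure from this deployment, and the 1 TiB / 8 KiB inputs are purely illustrative.

```python
DDT_ENTRY_BYTES = 320  # assumption: rule-of-thumb in-core size per table entry


def ddt_ram_bytes(unique_data_bytes: int, block_size: int,
                  entry_bytes: int = DDT_ENTRY_BYTES) -> int:
    """Estimate the RAM needed to hold the whole dedup table in memory."""
    if unique_data_bytes < 0 or block_size < 1:
        raise ValueError("unique_data_bytes must be >= 0 and block_size >= 1")
    entries = -(-unique_data_bytes // block_size)  # ceiling division
    return entries * entry_bytes


# Illustrative inputs: 1 TiB of unique data stored in 8 KiB blocks.
estimate = ddt_ram_bytes(unique_data_bytes=2**40, block_size=8 * 1024)
print(f"dedup table: ~{estimate / 2**30:.0f} GiB of RAM")
```

Small blocks are what make this painful: halving the block size doubles the number of entries, and every byte the table takes is a byte of ARC that is no longer caching data.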
Dedup Performance Penalty
[Chart: Dedup Enabled vs. No Dedup]
Is Dedup Worth It?
- If you’re using a “Golden Image”: No.
  - VMDC Plugin provides great efficiency by only storing one copy of the Golden Image vs. one for each pool of VMs.
  - Compression is virtually free and will do a good job of making up the difference in the “new” blocks.
  - Disk is cheap.
- If you’re doing a bunch of P2V desktop migrations: Maybe.
  - If the desktops are poorly configured, or have other aspects that can cause excessive I/O, then no.
  - If the desktops are similar and large, then sure.
Compression
- Use it. Unless you’re using a 5-year-old processor, there will be no noticeable performance hit.
- On by default in Nexenta 3.1.
- Compresses before write. Saves disk bandwidth!
Cache is Key!
- Between the 70 GB of ARC and 640 GB of L2ARC, the read cache is hit almost 98% of the time!
- This equates to sub-2 ms average disk latency to the end user.
- Beats the crud out of the >15 ms average latency of the N3600!
- Know your working set. You could get away with a lot smaller cache, or you could need a lot larger one.
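
Why the hit rate matters so much can be sketched with a simple weighted average: effective latency = hit rate x cache latency + miss rate x disk latency. Only the 98% hit rate comes from the slide. The 1 ms cache latency and 15 ms disk latency below are assumed round numbers for illustration, not measurements from this system.

```python
def effective_latency_ms(hit_rate: float, cache_ms: float, disk_ms: float) -> float:
    """Average read latency when `hit_rate` of reads are served from cache."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * cache_ms + (1.0 - hit_rate) * disk_ms


# Assumed latencies (illustrative only): 1 ms from cache, 15 ms from disk.
CACHE_MS = 1.0
DISK_MS = 15.0

at_98 = effective_latency_ms(0.98, CACHE_MS, DISK_MS)
at_90 = effective_latency_ms(0.90, CACHE_MS, DISK_MS)
print(f"98% hits: {at_98:.2f} ms, 90% hits: {at_90:.2f} ms")
```

Under these assumptions, dropping from a 98% to a 90% hit rate nearly doubles the average latency, which is why sizing the cache to the working set matters more than raw disk speed.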
Gig-E vs TenGig-E
- Obvious differences in maximum throughput.
- Small IOP differences are mainly attributable to network latency differences.
- If you’re stuck with Gig-E, use 802.3ad trunk groups.
  - Each host is still stuck with 100 MB/s throughput, but no one ESX host will saturate the link for the rest.
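
The reason a trunk group does not make any single host faster is that link aggregation pins each flow to one physical link, usually by hashing the addresses. The sketch below uses a simple XOR of source and destination IP, modulo the link count. That policy, the IP addresses, and the two-link trunk are illustrative assumptions; real switches and hypervisors offer several hash policies, and this is not any vendor’s exact algorithm.

```python
import ipaddress


def pick_link(src_ip: str, dst_ip: str, link_count: int) -> int:
    """Choose a trunk member for a flow from its source/destination IPs."""
    if link_count < 1:
        raise ValueError("link_count must be at least 1")
    src = int(ipaddress.ip_address(src_ip))
    dst = int(ipaddress.ip_address(dst_ip))
    return (src ^ dst) % link_count


# Hypothetical addresses: one storage head and six ESX hosts on a 2-link trunk.
storage = "10.0.0.10"
hosts = [f"10.0.0.{n}" for n in range(21, 27)]

links = {host: pick_link(host, storage, 2) for host in hosts}
for host, link in links.items():
    print(f"{host} -> link {link}")
```

A given host always lands on the same link, so it tops out at one link’s worth of bandwidth, but the hosts as a group are spread across both links.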
Gig-E vs TenGig-E - User Perspective
- Average time from the “Power On VM” command being issued until the user is able to log in:
  - 10 GbE: 23 seconds
  - 1 GbE: 32 seconds
- Time from when the user presses the “login” button until the desktop is ready to use:
  - 10 GbE: 5 seconds
  - 1 GbE: 9 seconds
*Windows 7, 2 procs, 2 GB RAM, DRMC’s Standard Clinical Image
Final Thought – All SSD Goodness
- For deployments of linked clones or VMs off of a Golden Image.
- Allows you to get rid of the L2ARC.
- Use a good ZIL device (STEC ZeusRAM, DDRdrive).
  - Allows for sequential writes to the SSDs in the pool.
  - Saves on write wear, which is an SSD killer.
  - My first test box with the X25-M SSDs started suffering after about 3 months.
- If you want HA you have to use SAS drives.