Spinning Brown Donuts: Why Storage Still Counts


Storage, next to server hardware, is pretty commoditized and probably the least exciting thing in your datacenter. However, failing to properly assess your storage needs and requirements can be the difference between a great app and a resume-generating event. This session will cover topics such as: why you may not need all flash, why SAN is not just NAS spelled backwards, leveraging cloud storage, why RAID is not a sound backup solution, and cutting through the marketing to make sense of it all.

  1. 1. SPINNING BROWN DONUTS Why Storage Still Counts Presented by: David Pechon, Jr. MCSA, VCP5-DCV
  3. 3. WHO DOES THIS GUY THINK HE IS? Started IT career with an enlistment in the US Army in 1997 as an Information Systems Operator/Analyst. Stationed at Fort Polk, LA; Yongsan Army Garrison in Seoul, South Korea; and Fort Bragg, NC (never airborne, instead a dirty nasty leg). Worked for a loan servicing company and three different banks in SE Louisiana, as well as consulting businesses ranging from a small MSP in New Orleans to a large systems integrator based in Denver. Started working at Sparkhound in February 2014, specializing in virtualization, storage, messaging, and identity management. Held certifications from Microsoft, VMware, NetApp, CommVault and SyncSort (now Catalogic Software). Married to my wife Clare for 8 years, with two children; currently resides in Ponchatoula, LA. Avid Chicago Cubs fan; enjoys fish, fine beers and grilling outdoors. Fun fact: My face was on the Today show in 1991 for a full five seconds when Joe Garagiola visited my school at Fort Stewart, GA. @davidpechon
  4. 4. … AND WHY IS HE SO EXCITED ABOUT STORAGE? As workloads become increasingly virtualized, storage becomes more and more of a potential bottleneck, and many technologies have been produced to reduce the impact. The amount of data generated has grown exponentially with no signs of slowing down. Information is an asset to any organization, and it needs to be secure and available at all times.
  5. 5. BRIEF HISTORY OF DATA STORAGE 1725-1940s: Punch Cards; 1948: Williams Tube; early 1950s: Drum Memory; 1951: Uniservo; 1956: IBM 350, first HDD; 1972: Data Cassette; 1976: Floppy disks; 1983: ST-506, first PC HDD; 1990s: Optical Media; 2000s: USB Flash; 2010s: Cloud
  6. 6. RAID IS NOT A BACKUP Seriously…
  7. 7. RAID IS NOT A BACKUP Anyone who thinks RAID is a backup should be swatted on the nose with a rolled-up newspaper…. …and laughed at too. RAID is used to spread storage load across spindles and/or survive a disk failure. RAID will not protect against rogue admins, stupid admins, stupid users, users in general, users looking to get out of congressional hearings, viruses, Decepticons, the guys the Go-Bots fought against, vampires, fire, earthquakes, nuclear apocalypse …. well, you get the idea.
  8. 8. STUPID IS AS STUPID DOES One official wrote me … “there are criminal penalties for destroying federal records, which makes sense, including liability for negligence for not taking the necessary steps to protect files, including a federal requirement to backup data. This doesn’t happen. All email servers are backed up with something called ‘RAID’ (Redundant Array Of Independent Disks), and it’s nearly impossible for something to delete the files, and that even if that were to happen they would not be gone forever.” Source: D. Giordano (June 16, 2014) Attkisson On Missing IRS Documents: If The Emails Really Are Lost, ‘That’s Quite A Story In Itself’. Retrieved from missing-irs-documents-if-the-emails-really-are-lost-thats-quite-a-story-in-itself/
  9. 9. BREAK IT DOWN … BARNEY STYLE RAID does not protect against deletion, be it accidental or intentional. RAID does not protect against data corruption. Some RAID levels will not protect you against disk loss, and none will protect you against other catastrophic failures. Remember Kids!! RAID is not a backup!
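A toy illustration of the point above, assuming a simplified RAID 5-style stripe of three data disks and one XOR parity disk. The disk contents and variable names are invented for the example, and real arrays add striping, rotation, and caching that this sketch ignores. It shows parity rebuilding a failed disk, and then shows the same parity faithfully "protecting" an overwrite.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte strings together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three data "disks" plus one parity block (hypothetical contents).
disks = [b"PAYROLL1", b"PAYROLL2", b"PAYROLL3"]
parity = xor_blocks(disks)

# Disk 1 dies: rebuild it from the surviving disks and the parity block.
rebuilt = xor_blocks([disks[0], disks[2], parity])
disk_failure_survived = rebuilt == disks[1]

# Someone overwrites disk 1: the array recomputes parity for the new data,
# so a later rebuild returns the overwritten bytes, not the original.
original = disks[1]
disks[1] = b"\x00" * 8
parity = xor_blocks(disks)
rebuilt_after_delete = xor_blocks([disks[0], disks[2], parity])
deletion_survived = rebuilt_after_delete == original

print(disk_failure_survived, deletion_survived)
```

The redundancy only knows about the current contents of the disks. Getting the original data back after a deletion or corruption takes a separate copy from an earlier point in time, which is what a backup is.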
  10. 10. TRADITIONAL DATACENTER STORAGE Why SAN is not just NAS Spelled Backwards
  11. 11. IN THE BEGINNING... DIRECT ATTACHED STORAGE: Direct attached storage is basically disks attached directly to a host via a storage controller card. While performance can be great, flexibility is low and it creates islands of storage. STORAGE AREA NETWORK: A SAN shares virtual disks from an array to a host. In this example, a fabric is being used. Storage is presented to a host as raw block storage. NETWORK ATTACHED STORAGE: A NAS hosts files over network shares. Storage is mapped to hosts. It was created to share information between computers over standard data networks. [Diagram labels: Storage Device; File System (CIFS, NFS, etc.); Network Storage]
  12. 12. NAS PROTOCOLS Server Message Block (SMB), a.k.a. Common Internet File System (CIFS): Primarily used by Windows to share files over a network. Supported by MacOS. Can be used by UNIX/Linux distributions with third-party tools like Samba. Latest version SMB 3.0 supports hardware acceleration and multipathing. Hyper-V 3.0 can use SMB 3.0 shares to store VMs in a cluster. Only NAS protocol supported to store Microsoft Exchange mailbox databases on virtual disks. Network File System (NFS): Developed by Sun Microsystems to share files with other Solaris systems. Primarily used by UNIX and UNIX-like operating systems. Windows Server 2012 can act as an NFS server natively. Latest version NFS 4.1 supports multipathing and parallel writes for applications like high performance computing (pNFS). NFS 3 is supported by all vSphere versions. NFS 4.1 is supported by ESXi 6/vSphere 6.
  13. 13. SAN PROTOCOLS Fibre Channel: Requires special FC switching and cards called Host Bus Adaptors or HBAs. Configured in a fabric configuration to minimize failure points and increase data paths. Developed to go beyond the SCSI limits for disk devices and tape drives. Lossless protocol to minimize storage latency. Great scalability and can traverse greater distances by use of dark fiber, in some cases up to 100 kilometers (a tad over 62 miles). iSCSI: Uses existing Ethernet/IP infrastructure and can use either software initiators or HBAs. Developed as a lower-cost alternative to Fibre Channel. Subject to the packet loss that can occur on an IP network. While it can run over any IP network, running it over wide area networks is not recommended.
  14. 14. BEST OF BOTH WORLDS - FCOE FCoE switches can carry both Fibre Channel and IP networking on the same switches, reducing complexity, cabling, and devices. Converged Network Adaptors replace specific HBAs and can also carry IP and Fibre Channel protocols. Popular in converged architecture sets such as FlexPod, Vblock, ActiveSystem, CloudMatrix, etc. Uses the same architecture and networking practices as Fibre Channel. Ethernet replaces the physical layer.
  16. 16. SO WHAT'S THE DIFFERENCE? • You would need a SAN if… • You need lower latency disk access over a lossless protocol. • You are using highly transaction-intensive systems like database management applications, enterprise resource planning, or email systems like Exchange Server. • You want to eliminate single points of failure by use of a fabric network and multiple data paths. • You need to traverse a campus or even a metro area over fiber. • You would need a NAS if… • You need lower administrative overhead without the need for special network configuration outside of setting up a VLAN or two. • You want to share files directly to users from the array, eliminating traditional file servers. • You want to cluster storage to scale out performance, not just capacity (scale-out NAS).
  17. 17. WHAT IF I WANT BOTH?
  18. 18. WHAT IF I WANT BOTH? Unified storage systems can host both SAN and NAS protocols from the same array, simplifying management and allowing more flexibility. NAS gateways are systems that use a LUN from a SAN to host file protocols. These can be systems that are built for that purpose or a general purpose operating system running on a server.
  19. 19. BEYOND ARRAYS Software Defined, Hybrid, All-Flash, and Convergence
  20. 20. HYBRID ARRAYS Hybrid arrays combine the use of traditional magnetic disk and solid state. The idea came from the method of storage tiering, where blocks of “hot” data are moved to faster disks. While effective for a while, it was basically trying to squeeze blood from a stone. Performance was still limited by mechanical disk speed, by the scheduling of block moves, and by when blocks were deemed hot or cold. [Diagram: hot, warm, and cold data blocks spread across three tiers: 15k RPM SAS RAID 10, 10k RPM SAS RAID 6, and 7.2k RPM SATA RAID 6]
  21. 21. HYBRID ARRAYS Hybrid arrays combine the use of traditional magnetic disk and solid state. In most hybrid arrays, hot data is cached in SSDs or PCI flash in the storage array. Some arrays will use DRAM as a cache level. Data isn’t moved, but the array will use metadata to point reads to the cache; reads served this way are known as “cache hits”. Some hybrid arrays have the ability to use SSDs as a write cache, to ingest large amounts of data quickly, then move it to slower storage. [Diagram: SSD cache in the storage array and SSD cache in the server]
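The read-caching idea above can be sketched as a small least-recently-used (LRU) cache sitting in front of a slow backing store. This is a toy model, not how any particular array works: the `ReadCache` class, the four-block capacity, and the workload are all invented for illustration, and real arrays use their own eviction and promotion policies.

```python
from collections import OrderedDict

class ReadCache:
    """Toy LRU read cache standing in for an SSD tier in front of slow disk."""

    def __init__(self, capacity_blocks):
        self.capacity = capacity_blocks
        self.blocks = OrderedDict()  # block id -> data, least recent first
        self.hits = 0
        self.misses = 0

    def read(self, block_id, backing_store):
        if block_id in self.blocks:
            self.hits += 1
            self.blocks.move_to_end(block_id)  # mark as most recently used
            return self.blocks[block_id]
        self.misses += 1
        data = backing_store[block_id]  # slow path: read from spinning disk
        self.blocks[block_id] = data    # copy into cache; disk copy stays put
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)  # evict least recently used block
        return data

disk = {n: f"block-{n}" for n in range(100)}  # hypothetical backing store
cache = ReadCache(capacity_blocks=4)
workload = [1, 2, 1, 1, 3, 2, 50, 1, 2, 1]    # a few hot blocks, one cold read
for block in workload:
    cache.read(block, disk)

hit_ratio = cache.hits / len(workload)
print(cache.hits, cache.misses, hit_ratio)
```

With this workload, six of the ten reads are served from the cache. The more a workload concentrates on a small set of hot blocks, the closer a small flash cache gets to all-flash read performance.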
  22. 22. ALL FLASH ARRAYS It’s an array with all flash drives…. …duh. On a serious note, what sets vendors apart are features. Violin Memory is an example of a vendor whose arrays don’t do any special space efficiency but are very dense solid state arrays. Pure Storage sacrifices some raw performance for space efficiencies such as deduplication and compression. Some traditional storage vendors, like HP and NetApp, have added all-flash support to their existing storage arrays.
  23. 23. CONVERGED SYSTEMS VMware brought virtualization to commodity x86 computing, bringing the benefits of mainframes to less expensive hardware. Fibre Channel over Ethernet allowed datacenters to reduce the number of networking devices in the datacenter. Cisco UCS platforms decoupled various hardware settings from systems, allowing you to move WWNs, MAC addresses, BIOS settings, etc. to a new node either hot or cold. Unified storage systems such as NetApp FAS and EMC VNX allowed for all storage protocols under one system. This led to the concept of converged systems, where compute, network, storage and hypervisor systems were combined under a validated model, giving the customer one number to call for support, also known as “one throat to choke.”
  24. 24. SOFTWARE DEFINED STORAGE Software Defined Storage is the ability to get the features of a storage array in a virtual appliance rather than in hardware, or to run a storage OS on your own hardware. This has given birth to two disruptive technologies…
  25. 25. HYPERCONVERGED Hyperconverged systems cluster the local storage on virtual hosts by using a storage VM or the hypervisor itself. This technology has proven to be excellent for applications that scale linearly, such as big data and virtual desktop infrastructure. [Diagram: three hypervisor hosts, each with a storage VM, a SCSI controller, two SSDs, and four SATA drives serving VM I/O, joined into a virtual storage cluster]
  26. 26. CLOUD STORAGE Cloud storage can be in the form of user-accessible storage such as OneDrive or Dropbox. It can be a cold data tier, as used by Microsoft StorSimple. It can be used as a replication or backup target, as with NetApp Cloud ONTAP. Or it can replicate your entire infrastructure for disaster recovery with services like vCloud Air or Hyper-V/Azure replication.
  27. 27. FLASH! ….ahhh ahhhh….savior of the universe datacenter!
  28. 28. TYPES OF SOLID STATE DRIVES Single-level cell flash or SLC NAND* memory stores one bit per cell. It can endure more writes than any other flash memory available and is usually the most expensive. Multi-level cell or MLC flash can store two to three bits per cell but has a shorter lifespan than SLC. Enterprise MLC or eMLC consists of higher-quality chips, much like how enterprise drives are more reliable than consumer grade. eMLC SSDs cost more than consumer MLC SSDs but less than SLC SSDs. *NAND being a transistor logic gate, which is a negation of the AND operator. NOR logic gates are used in some SLC flash, where the logic gate results in the negation of an OR operator.
  29. 29. ALL FLASH OR HYBRID? It all depends! Use metrics such as $/GB or $/IOPS to determine cost. Do you need sub-millisecond latency? Do you want the benefits of flash with a cost somewhat similar to disk? Not all workloads need all flash arrays.
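A minimal sketch of the $/GB and $/IOPS comparison mentioned above. Every number in the `quotes` dictionary is invented purely for illustration and does not come from any vendor or from the talk; substitute figures from your own quotes and, ideally, from testing your own workload.

```python
def cost_per_gb(price_usd, usable_gb):
    """Purchase price divided by usable capacity."""
    return price_usd / usable_gb

def cost_per_iops(price_usd, iops):
    """Purchase price divided by measured IOPS for your workload."""
    return price_usd / iops

# Hypothetical quotes, made up for this example only.
quotes = {
    "hybrid":    {"price_usd": 60_000,  "usable_gb": 40_000, "iops": 30_000},
    "all_flash": {"price_usd": 120_000, "usable_gb": 20_000, "iops": 200_000},
}

results = {
    name: {
        "per_gb": cost_per_gb(q["price_usd"], q["usable_gb"]),
        "per_iops": cost_per_iops(q["price_usd"], q["iops"]),
    }
    for name, q in quotes.items()
}

for name, r in results.items():
    print(f"{name}: ${r['per_gb']:.2f}/GB, ${r['per_iops']:.2f}/IOPS")
```

With these made-up figures the hybrid array is cheaper per gigabyte and the all-flash array is cheaper per IOPS, which is why the answer depends on whether the workload is bound by capacity or by performance.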
  30. 30. BUZZWORDS AND MARKETING Cutting through the BS
  31. 31. BIG DATA Big Data is basically taking petabytes of unstructured and structured data and turning it into something useful. Storage frameworks like Hadoop make this possible. Hadoop requires an array of nodes that are usually only needed on demand. Amazon EMR and Azure HDInsight are cloud services specifically for provisioning Hadoop clusters in seconds, and you only pay while a workload is running.
  32. 32. HERO NUMBERS
  33. 33. HERO NUMBERS IOPS stands for Input/Output Operations Per Second. Most numbers quoted are based on a 4 kilobyte block size and sequential reads, which most drives and arrays can perform quite well. Most applications like SQL Server will require a 64k block size, so this figure readjusted for 64k sequential reads would be 12,500 IOPS. Deduplication is the method of removing redundant blocks of storage to save space. Metadata is used to reconstruct data from the deduplicated blocks. The ratio all depends on how much redundant data is being stored. If you have 10 desktops with nothing else installed, you pretty much have the same bits written 10 times, hence the 10:1 dedupe ratio. Databases, for instance, may only see 10-30% space savings with deduplication. NL-SAS or Nearline SAS drives are SATA drives that can use SAS backplanes. They’re no faster per spindle than SATA drives. I can imagine Paul Thurrott saying something like that.
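The arithmetic behind both hero numbers can be sketched in a few lines. Two assumptions are made here and should be read as such: the original quoted IOPS figure is not in the transcript, so the 200,000 used below is back-calculated from the 12,500 result (12,500 × 64 / 4); and the conversion assumes throughput stays constant as block size changes, which is a simplification of how real arrays behave. The desktop image blocks are invented data.

```python
import hashlib

def iops_at_block_size(quoted_iops, quoted_block_kb, target_block_kb):
    """Rescale an IOPS figure to another block size at constant throughput."""
    return quoted_iops * quoted_block_kb / target_block_kb

def dedupe_ratio(blocks):
    """Logical blocks written divided by unique blocks actually stored."""
    unique = {hashlib.sha256(block).digest() for block in blocks}
    return len(blocks) / len(unique)

# Assumed hero number at 4k blocks (back-calculated, not from the source).
hero_iops = 200_000
adjusted = iops_at_block_size(hero_iops, quoted_block_kb=4, target_block_kb=64)

# Ten identical desktop images: every block is written ten times.
desktop_image = [b"os-block-%d" % n for n in range(1000)]
ten_clones = desktop_image * 10
clone_ratio = dedupe_ratio(ten_clones)

print(adjusted, clone_ratio)
```

The same 10:1 ratio would not appear on data where most blocks are unique, which is why a quoted dedupe ratio says more about the test data than about the array.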
  34. 34. IN CONCLUSION Sizing storage properly can make or break your line-of-business applications. A lower-cost hybrid array may be sufficient instead of all flash. You may want to consider cloud storage over an on-premises array for cold storage. Never let a vendor tell you how you should run your systems on their storage. A good storage consultant should be able to size an environment not only for your applications but for the entire lifecycle of the appliance. Always ask for a “bake off”, meaning you can test your workloads on their gear before signing a purchase order. Be wary of “hero numbers”; again, use a bake off to get a much better picture of how their system will work for you.