Storage Virtualization Introduction


  • Taneja “Next-Generation FC Arrays”: clustered controller design, sub-disk virtualization, self-configuring and self-tuning storage, automated storage tiering, thin technologies
  • Up to 256 FC or iSCSI LUNs; ESX multipathing: load balancing, failover, failover between FC and iSCSI. *Beware of block sizes greater than 256 KB! If you want virtual disks greater than 256 GB, you must use a VMFS block size larger than 1 MB. Align your virtual disk starting offset to your array (by booting the VM and using diskpart, Windows PE, or UNIX fdisk).*
  • Link Aggregation Control Protocol (LACP) for trunking/EtherChannel; use the “fixed” path policy, not MRU. Up to 8 (or 32) NFS mount points. Turn off access-time updates. Thin provisioning? Turn on AutoSize and watch out.
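The alignment advice in the notes above can be sketched as a quick check — a minimal Python sketch, assuming 512-byte sectors and an illustrative 64 KB array chunk size (substitute your array's real stripe/chunk value):

```python
# Minimal sketch: check whether a virtual disk partition's starting offset
# lands on the array's chunk boundary, as the alignment note above advises.
# The 64 KB chunk size is an assumption; use your array's actual value.

SECTOR_BYTES = 512

def is_aligned(start_sector: int, chunk_bytes: int = 64 * 1024) -> bool:
    """True if the partition's byte offset falls on a chunk boundary."""
    return (start_sector * SECTOR_BYTES) % chunk_bytes == 0

# Classic misaligned Windows default: partition starts at sector 63.
print(is_aligned(63))    # → False (misaligned, every I/O may span two chunks)
print(is_aligned(128))   # → True  (128 * 512 B = 64 KB boundary)
```

This is the same arithmetic you apply by hand after reading the start offset from diskpart or fdisk.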

    1. 1. Storage Virtualization Seminar<br />Stephen Foskett<br />Director of Data Practice, Contoural<br />
    2. 2. Part 1:Breaking the Connections<br />Storage virtualization is here, breaking the connection between physical storage infrastructure and the logical way we use it<br />
    3. 3. Agenda<br />What is storage virtualization?<br />Volume management<br />Advanced file systems<br />Virtualizing the SAN <br />Virtual NAS<br />
    4. 4. Poll: Who is Already Using Storage Virtualization?<br /><ul><li>We talk about virtualization like it is new or strange…
    5. 5. …but your storage is already virtualized!
    6. 6. Disk drives map blocks
    7. 7. RAID is as old as storage (conceived 1978-1988)
    8. 8. Modern OSes include volume management and path management
    9. 9. Network-attached storage (NAS) redirectors and DFS
    10. 10. Storage arrays are highly virtualized (clustering, LUN carving, relocation, tiering, etc…)
    11. 11. According to ESG, 52% have already implemented storage virtualization and 48% plan to! (ESG 2008)</li></li></ul><li>SNIA Defines Storage Virtualization<br />The act of abstracting, hiding, or isolating the internal function of a storage (sub)system or service from applications, compute servers or general network resources for the purpose of enabling application and network independent management of storage or data.<br />The application of virtualization to storage services or devices for the purpose of aggregating, hiding complexity or adding new capabilities to lower level storage resources. Storage can be virtualized simultaneously in multiple layers of a system, for instance to create HSM like systems.<br />
    12. 12. What and Why?<br />Virtualization removes the hard connection between storage hardware and users<br />Address space is mapped to logical rather than physical locations<br />The virtualizing service consistently maintains this meta-data<br />I/O can be redirected to a new physical location<br />We gain by virtualizing<br />Efficiency, flexibility, and scalability<br />Stability, availability, and recoverability <br />
    13. 13. The Non-Revolution:Storage Virtualization<br />Software<br /><ul><li>We’ve been talking about storage virtualization for 15 years!
    14. 14. Virtualization exists for both block and file storage networks
    15. 15. Can be located in server-based software, on network-based appliances, SAN switches, or integrated with the storage array</li></ul>Switch<br />Appliance<br />Array<br />
    16. 16. Introducing Volume Management<br /><ul><li>Volume management = server-based storage virtualization
    17. 17. Volume managers abstract block storage (LUNs, disks, partitions) into virtual “volumes”
    18. 18. Very common – all* modern OSes have volume managers built in
    19. 19. Windows Logical Disk Manager, Linux LVM/EVMS, AIX LVM, HP-UX LVM, Solaris Solstice, Veritas Volume Manager
    20. 20. Mostly used for flexibility
    21. 21. Resize volumes
    22. 22. Protect data (RAID)
    23. 23. Add capacity (concatenate or expand stripe or RAID)
    24. 24. Mirror, snapshot, replicate
    25. 25. Migrate data</li></li></ul><li>Logical Volume Managers<br />
    26. 26. ZFS: Super File System!<br /><ul><li>ZFS (originally “zettabyte file system”) is a combined file system, volume manager, disk/partition manager
    27. 27. Open source (CDDL) project managed by Sun
    28. 28. Will probably replace UFS (Sun) and HFS+ (Apple Mac OS X Snow Leopard Server)
    29. 29. ZFS creates a truly flexible, extensible, and full-featured pool of storage across systems and disks
    30. 30. Filesystems contained in “zpools” on “vdevs” with striping and optional RAID-Z/Z2
    31. 31. 128-bit addresses mean near-infinite capacity (in theory)
    32. 32. Blocks are “copy-on-write” with checksums for snapshots, clones, authentication
    33. 33. …but there are some limitations
    34. 34. Adding (and especially removing) vdevs is hard/impossible
    35. 35. Stacked RAID is impossible
    36. 36. There is no clustering (until Sun adds Lustre)</li></li></ul><li>Path Management Software<br />Path management virtualizes the connection from a server to a storage system<br />Failover<br />Load balancing strategies<br />A few choices<br />Veritas DMP (cross-platform, with Storage Foundation)<br />EMC PowerPath (supports EMC, HDS, IBM, HP)<br />IBM SDD (free for IBM)<br />HDS (HDLM)<br />Microsoft MPIO (Windows, supports iSCSI and most FC)<br />VMware Failover Paths<br />
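The failover and load-balancing behavior described on the path management slide can be sketched in a few lines — an illustrative Python sketch, not any vendor's API (the path names and class are invented for the example):

```python
# Illustrative sketch (not PowerPath, DMP, or MPIO): a path manager that
# round-robins across healthy paths and fails over when one is marked down.

class PathManager:
    def __init__(self, paths):
        self.paths = list(paths)          # e.g. ["hba0:port0", "hba1:port1"]
        self.healthy = set(self.paths)
        self._next = 0

    def mark_failed(self, path):
        self.healthy.discard(path)        # failover: stop issuing I/O here

    def mark_restored(self, path):
        if path in self.paths:
            self.healthy.add(path)        # failback becomes possible

    def pick_path(self):
        """Round-robin over healthy paths; raise if every path is down."""
        live = [p for p in self.paths if p in self.healthy]
        if not live:
            raise IOError("all paths to the array are down")
        path = live[self._next % len(live)]
        self._next += 1
        return path

pm = PathManager(["hba0:port0", "hba1:port1"])
pm.mark_failed("hba0:port0")
print(pm.pick_path())   # → hba1:port1 (only the surviving path is used)
```

Real products layer SCSI sense-code analysis, per-LUN policies, and failback timers on top of this basic loop.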
    37. 37. Virtualizing the SAN<br /><ul><li>The storage area network (SAN) is a popular location for virtualization
    38. 38. Can require less reconfiguration and server work
    39. 39. Works with all servers and storage (potentially)
    40. 40. Resides on appliance or switch placed in the storage network
    41. 41. Some are in the data path, others are less so
    42. 42. Brocade and Cisco switches have application blades
    43. 43. Some use dedicated storage services modules (SSMs)</li></li></ul><li>In-Band vs. Out-of-Band<br />In-band devices intercept traffic<br />Out-of-band devices redirect traffic<br />Where’s my data?<br />Where’s my data?<br />It’s over there!<br />I got yer data right here!<br />
    44. 44. SAN Virtualization Products<br />
    45. 45. Virtual NAS<br /><ul><li>File-based network-attached storage (NAS) lends itself to virtualization
    46. 46. IP network connectivity and host processing possibilities
    47. 47. Multitude of file servers? Virtualize!
    48. 48. Global namespace across all NAS and servers
    49. 49. Share excess capacity
    50. 50. Transparently migrate data (easier than redirecting users!)
    51. 51. Tier files on large “shares” with variety of data
    52. 52. Create multiple virtual file servers</li></li></ul><li>NAS Virtualization Products<br />
    53. 53. Transformed Storage Systems<br /><ul><li>Virtualization technology is common in storage array controllers
    54. 54. Arrays create large RAID sets and “carve out” virtual LUNs for use by servers
    55. 55. Controller clusters (and grids) redirect activity based on workload and availability
    56. 56. Snapshots/mirrors and replication are common features
    57. 57. A new generation of arrays with virtualization features is appearing, offering tiered storage, thin provisioning, migration, and de-duplication
    58. 58. Sub-disk RAID = the end of RAID as we know it?</li></li></ul><li>Virtual Tiered Storage<br /><ul><li>Array controllers can transparently move data from low-cost to high-performance disk
    59. 59. Most arrays support multiple drive types
    60. 60. “Bulk” SATA or SAS drives are common (500 GB - 1 TB)
    61. 61. Solid-state drives are the latest innovation
    62. 62. Some arrays can dynamically load balance
    63. 63. A few can “hide” other arrays behind themselves (external storage virtualization)
    64. 64. SAN: HDS USP-V and similar from Sun, HP
    65. 65. NAS: Network Appliance vFiler, ONStor Bobcat</li></li></ul><li>Thin Provisioning<br /><ul><li>Storage is commonly over-allocated to servers
    66. 66. Some arrays can “thinly” provision just the capacity that actually contains data
    67. 67. 500 GB request for new project, but only 2 GB of initial data is written – array only allocates 2 GB and expands as data is written
    68. 68. Symantec API, thin-unprovisioning capabilities
    69. 69. What’s not to love?
    70. 70. Oops – we provisioned a petabyte and ran out of storage
    71. 71. Chunk sizes and formatting conflicts
    72. 72. Can it thin unprovision?
    73. 73. Can it replicate to and from thin provisioned volumes?
    74. 74. Thin provisioning is an abdication of our responsibilities!</li></li></ul><li>De-Duplication<br /><ul><li>The next frontier – efficiently storing duplicate content
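The allocate-on-write behavior behind the 500 GB/2 GB example above can be sketched briefly — a minimal Python sketch, assuming an illustrative 1 MB allocation chunk (real arrays use vendor-specific chunk sizes, which is why the "chunk sizes and formatting conflicts" caveat matters):

```python
# Sketch of thin provisioning's allocate-on-write idea: the volume advertises
# its full size but only backs the chunks that have actually been written.
# The 1 MB chunk size is an illustrative assumption.

CHUNK = 1 * 1024 * 1024  # 1 MB allocation granularity

class ThinVolume:
    def __init__(self, advertised_bytes):
        self.advertised = advertised_bytes
        self.backed = set()              # chunk indexes holding real data

    def write(self, offset, length):
        first = offset // CHUNK
        last = (offset + length - 1) // CHUNK
        for chunk in range(first, last + 1):
            self.backed.add(chunk)       # allocate only what is touched

    def allocated_bytes(self):
        return len(self.backed) * CHUNK

# A "500 GB" volume that has only seen 2 MB of writes:
vol = ThinVolume(500 * 1024**3)
vol.write(0, 2 * 1024 * 1024)
print(vol.allocated_bytes())  # → 2097152 (2 MiB backed, not 500 GB)
```

The "oops, we provisioned a petabyte" failure mode falls out of this model: the sum of advertised sizes can far exceed the physical pool backing the chunks.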
    75. 75. More appropriate to some applications than others
    76. 76. Software or appliance (and now array!) analyzes files or blocks, saving duplicates just once
    77. 77. Block-based approaches reduce capacity more by looking inside files
    78. 78. Once common only for archives, now available for production data
    79. 79. Serious implications for performance and capacity utilization
    80. 80. In-line devices process all data before it is written
    81. 81. Post-processing systems scan written data for duplicates</li></li></ul><li>“Cloud” Storage<br />Many companies are choosing managed services for servers and storage<br />Lots of managed archive and backup providers<br />Zantaz, Google Postini, EMC Mozy, Symantec SPN, etc<br />Managed storage services are coming into their own (finally!)<br />Amazon S3 and Nirvanix<br />EMC “Fortress”<br />
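The block-based de-duplication described on the previous slide — hash each block, store duplicates just once — can be sketched as follows. A minimal Python sketch with an assumed 4 KB block size; the class and names are invented for illustration:

```python
# Sketch of block-level de-duplication: split data into fixed-size blocks,
# hash each one, and keep a given block's contents only once.
import hashlib

BLOCK = 4096

class DedupStore:
    def __init__(self):
        self.blocks = {}     # digest -> block contents (stored once)
        self.files = {}      # name -> ordered list of block digests

    def put(self, name, data):
        digests = []
        for i in range(0, len(data), BLOCK):
            chunk = data[i:i + BLOCK]
            digest = hashlib.sha256(chunk).hexdigest()
            self.blocks.setdefault(digest, chunk)   # duplicate saved just once
            digests.append(digest)
        self.files[name] = digests

    def stored_bytes(self):
        return sum(len(b) for b in self.blocks.values())

store = DedupStore()
store.put("vm1.vmdk", b"A" * 8192)
store.put("vm2.vmdk", b"A" * 8192)   # identical clone from a template
print(store.stored_bytes())          # → 4096: one unique block, stored once
```

Hashing every block as it arrives is the in-line approach; a post-processing system would run the same loop later over data already on disk.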
    82. 82. The Next-Generation Data Center<br />Virtualization of server and storage will transform the data center<br />Clusters of capability host virtual servers<br />Cradle to grave integrated management<br />SAN/network convergence is next<br />InfiniBand offers converged virtual connectivity today<br />iSCSI and FCoE become datacenter Ethernet (DCE) with converged network adapters (CNAs)<br />
    83. 83. Audience Response<br />Question?<br />
    84. 84. Break sponsored by<br />
    85. 85. Part 2:Storage in the Virtual World<br />Responding to the demands of server, application, and business users with new flexible technologies<br />
    86. 86. Agenda<br />Why virtual storage for virtual servers?<br />The real world impact and benefits<br />Best practices for implementation<br />
    87. 87. Poll: Who Is Using VMware?<br />
    88. 88. Poll: Does Server Virtualization Improve Storage Utilization?<br />
    89. 89. Why Use Virtual Storage For Virtual Servers?<br />Mobility of virtual machines between physical servers for load balancing<br />Improved disaster recovery<br />Higher availability<br />Enabling physical server upgrades<br />Operational recovery of virtual machine images<br />
    90. 90. Server Virtualization = SAN and NAS<br /><ul><li>Server virtualization has transformed the data center and storage requirements
    91. 91. VMware is the #1 driver of SAN adoption today!
    92. 92. 60% of virtual server storage is on SAN or NAS (ESG 2008)
    93. 93. 86% have implemented some server virtualization (ESG 2008)
    94. 94. Server virtualization has enabled and demanded centralization and sharing of storage on arrays like never before!</li></li></ul><li>Three Pillars of VM Performance<br />
    95. 95. Server Virtualization Recoil<br /><ul><li>Dramaticallyincreased I/O
    96. 96. Patchwork of support, few standards
    97. 97. “VMware mode” on storage arrays
    98. 98. Virtual HBA/N_Port ID Virtualization (NPIV)
    99. 99. Everyone is qualifying everyone and jockeying for position
    100. 100. Can be “detrimental” to storage utilization
    101. 101. Befuddled traditional backup, replication, reporting</li></li></ul><li>VMware Storage Options:Shared Storage<br />Shared storage - the common/ workstation approach<br />Stores VMDK image in VMFS datastores<br />DAS or FC/iSCSI SAN<br />Hyper-V VHD is similar<br />Why?<br />Traditional, familiar, common (~90%)<br />Prime features (Storage VMotion, etc)<br />Multipathing, load balancing, failover*<br />But…<br />Overhead of two storage stacks (5-8%)<br />Harder to leverage storage features<br />Often shares storage LUN and queue<br />Difficult storage management<br />VM<br />Host<br />Guest<br />OS<br />VMFS<br />VMDK<br />DAS or SAN<br />Storage<br />
    102. 102. VMware Storage Options:Shared Storage on NFS<br />Shared storage on NFS – skip VMFS and use NAS<br />NFS is the datastore<br />Wow!<br />Simple – no SAN<br />Multiple queues<br />Flexible (on-the-fly changes)<br />Simple snap and replicate*<br />Enables full VMotion<br />Use fixed LACP for trunking<br />But…<br />Less familiar (3.0+)<br />CPU load questions<br />Default limited to 8 NFS datastores<br />Will multi-VMDK snaps be consistent?<br />VM<br />Host<br />Guest<br />OS<br />NFS<br />Storage<br />VMDK<br />
    103. 103. VMware Storage Options:Raw Device Mapping (RDM)<br />Raw device mapping (RDM) - guest VM’s access storage directly over iSCSI or FC<br />VM’s can even boot from raw devices<br />Hyper-V pass-through LUN is similar<br />Great!<br />Per-server queues for performance<br />Easier measurement<br />The only method for clustering<br />But…<br />Tricky VMotion and DRS<br />No storage VMotion<br />More management overhead<br />Limited to 256 LUNs per data center<br />VM<br />Host<br />Guest<br />OS<br />I/O<br />Mapping File<br />SAN Storage<br />
    104. 104. Physical vs. Virtual RDM<br />Virtual Compatibility Mode<br />Appears the same as a VMDK on VMFS<br />Retains file locking for clustering<br />Allows VM snapshots, clones, VMotion<br />Retains same characteristics if storage is moved<br />Physical Compatibility Mode<br />Appears as a LUN on a “hard” host<br />Allows V-to-P clustering; no VMware locking<br />No VM snapshots, VCB, VMotion<br />All characteristics and SCSI commands (except “Report LUN”) are passed through – required for some SAN management software<br />
    105. 105. Physical vs. Virtual RDM<br />
    106. 106. Poll: Which VMware Storage Method Performs Best?<br />Mixed Random I/O<br />CPU Cost Per I/O<br />VMFS,<br />RDM (p), or RDM (v)<br />Source: “Performance Characterization of VMFS and RDM Using a SAN”, VMware Inc., 2008<br />
    107. 107. Which Storage Protocol is For You?<br />FC, iSCSI, NFS all work well<br />Most production VM data is on FC<br />Either/or? - 50% use a combination (ESG 2008)<br />Leverage what you have and are familiar with<br />For IP storage<br />Use TOE cards/iSCSI HBAs<br />Use a separate network or VLAN<br />Is your switch backplane fast?<br />No VM Cluster support with iSCSI*<br />For FC storage<br />4 Gb FC is awesome for VM’s<br />Get NPIV (if you can)<br />
    108. 108. Poll: Which Storage Protocol Performs Best?<br />Throughput by I/O Size<br />CPU Cost Per I/O<br />Fibre Channel,<br />NFS,<br />iSCSI (sw),<br />iSCSI (TOE)<br />Source: “Comparison of Storage Protocol Performance”, VMware Inc., 2008<br />
    109. 109. Storage Configuration Best Practices<br />Separate operating system and application data<br />OS volumes (C: or /) on a different VMFS or LUN from applications (D: etc)<br />Heavy apps get their own VMFS or raw LUN(s)<br />Optimize storage by application<br />Consider different tiers or RAID levels for OS, data, transaction logs - automated tiering can help<br />No more than one VMFS per LUN<br />Less than 16 production ESX .VMDKs per VMFS<br />Get thin<br />Deduplication can have a huge impact on VMDKs created from a template!<br />Thin provisioning can be very useful – Thin disk is in Server, not ESX!?!<br />
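The block-size rule behind the best practices above (and the earlier note that VMDKs over 256 GB need a VMFS block size larger than 1 MB) can be captured in a small helper — an illustrative Python sketch assuming the commonly cited VMFS-3 limits (1 MB block → 256 GB max file, 2 MB → 512 GB, 4 MB → 1 TB, 8 MB → 2 TB); the function name is invented:

```python
# Illustrative helper for the VMFS-3 sizing rule: the maximum virtual disk
# file size scales with the datastore's block size, chosen at format time.
# Assumed limits: 1 MB -> 256 GB, 2 MB -> 512 GB, 4 MB -> 1 TB, 8 MB -> 2 TB.

def min_vmfs_block_mb(vmdk_gb: int) -> int:
    """Smallest VMFS-3 block size (MB) that can hold a VMDK of vmdk_gb GB."""
    for block_mb in (1, 2, 4, 8):
        if vmdk_gb <= block_mb * 256:
            return block_mb
    raise ValueError("VMDK exceeds the assumed VMFS-3 maximum (2 TB)")

print(min_vmfs_block_mb(200))   # → 1: default 1 MB blocks suffice
print(min_vmfs_block_mb(300))   # → 2: over 256 GB, needs 2 MB blocks
```

Because the block size cannot be changed after the datastore is formatted, running this check before creating the VMFS saves a later evacuate-and-reformat.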
    110. 110. Why NPIV Matters<br />Without NPIV<br />N_Port ID Virtualization (NPIV) gives each server a unique WWN<br />Easier to move and clone* virtual servers <br />Better handling of fabric login<br />Virtual servers can have their own LUNs, QoS, and zoning<br />Just like a real server!<br />When looking at NPIV, consider:<br />How many virtual WWNs does it support? T11 spec says “up to 256”<br />OS, virtualization software, HBA, FC switch, and array support and licensing<br />Can’t upgrade some old hardware for NPIV, especially HBAs<br />Virtual Server<br />Virtual Server<br />Virtual Server<br />21:00:00:e0:8b:05:05:04<br />With NPIV<br />Virtual Server<br />Virtual Server<br />Virtual Server<br />…05:05:05<br />…05:05:06<br />…05:05:07<br />
    111. 111. Virtualization-Enabled Disaster Recovery<br />DR is a prime beneficiary of server and storage virtualization<br />Fewer remote machines idling<br />No need for identical equipment<br />Quicker recovery (RTO) through preparation and automation<br />Who’s doing it?<br />26% are replicating server images, an additional 39% plan to (ESG 2008)<br />Half have never used replication before (ESG 2008)<br />News: VMware Site Recovery Manager (SRM) integrates storage replication with DR<br />
    112. 112. Enhancing Virtual Servers with Storage Virtualization<br />Mobility of server and storage images enhances load balancing, availability, and maintenance<br />SAN and NAS arrays can snap and replicate server images<br />VMotion moves the server, Storage VMotion (new in 3.5) moves the storage between shared storage locations<br />Virtualization-optimized storage<br />Pillar and HDS claim to tweak allocation per VM<br />Many vendors announcing compatibility with VMware SRM<br />Most new arrays are NPIV-capable<br />Virtual storage appliances<br />LeftHand VSA – A virtual virtualized storage array<br />FalconStor CDP – a virtual CDP system<br />
    113. 113. Enabling Virtual Backup<br />Virtual servers cause havoc for traditional client/server backups<br />I/O crunch as schedules kick off – load is consolidated instead of balanced<br />Difficult to manage and administer (or even comprehend!)<br />Storage virtualization can help<br />Add disk to handle the load (VTL)<br />Switch to alternative mechanisms (snapshots, CDP)<br />Consider VMware consolidated backup (VCB)<br />Snapshot-based backup of shared VMware storage<br />Block-based backup of all VMDKs on a physical server<br />
    114. 114. Audience Response<br />Question?<br />
    115. 115. Break sponsored by<br />
    116. 116. Part 3:Should You Virtualize?<br />A look at the practical benefits of virtualized storage<br />
    117. 117. Agenda<br /><ul><li>Pooling for efficiency, flexibility, and scalability
    118. 118. Performance
    119. 119. Stability, availability, and recoverability
    120. 120. The down side
    121. 121. Cost benefit analysis
    122. 122. Where will you virtualize?</li></li></ul><li>Pooling: Efficiency, Flexibility, and Scalability<br /><ul><li>Effective allocation of resources
    123. 123. The right amount of storage for the application
    124. 124. The right type (tiered storage)
    125. 125. Quickly add and remove on demand
    126. 126. Move storage from one device to another
    127. 127. Tiering, expansion, retirement
    128. 128. Larger systems have fewer capacity limitations</li></li></ul><li>How Green Am I?<br />Server virtualization can dramatically reduce power, cooling, and space requirements<br />Fewer physical servers<br />Better (any) power management<br />Storage virtualization offers fewer green benefits<br />Does not normally reduce equipment footprint<br />Enterprise storage systems not very energy efficient<br />Transformed storage systems might help<br />De-duplication, tiered storage, and archiving can slow growth<br />New MAID and spin-down devices offer power/cooling savings<br />
    129. 129. Performance<br />A battle royale between in- and out-of-band!<br />In-band virtualization can improve performance with caching<br />Out-of-band stays out of the way, relying on caching at the device level<br />Split-path adds scalability to in-band<br />Large arrays perform better (usually) than lots of tiny RAIDs or disks<br />First rule of performance: Spindles<br />Second rule of performance: Cache<br />Third rule of performance: I/O Bottlenecks<br />
    130. 130. Solid State Drives (and Myths)<br />The new (old) buzz<br />RAM vs. NAND flash vs. disk<br />EMC added flash drives to the DMX (CX?) as “tier-0”, CEO Joe Tucci claims flash will displace high-end disk after 2010<br />Sun, HP adding flash to the server as a cache<br />Gear6 caches NAS with RAM<br />But…<br />Are they reliable?<br />Do they really perform that well?<br />Will you be able to use them?<br />Is the 10x-30x cost justified?<br />Do they really save power?<br />Notes: 1 – No one writes this fast 24x7<br /> 2 – Manufacturers claim 2x to 10x better endurance<br />
    131. 131. Stability, Availability, and Recoverability<br />Replication creates copies of storage in other locations<br />Local replicas (mirrors and snapshots) are usually frequent and focused on restoring data in daily use<br />Remote replicas are used to recover from disasters<br />Virtualization can ease replication<br />Single point of configuration and monitoring<br />Can support different hardware at each location<br />
    132. 132. We Love It!<br />Efficiency, scalability, performance, availability, recoverability, etc…<br />Without virtualization, none of this can happen!<br />
    133. 133. The Down Side<br /><ul><li>Consolidation and centralization create bigger baskets for your precious data
    134. 134. Downtime and performance affect more systems
    135. 135. Harder to back out if unsatisfied
    136. 136. Additional complexity and interoperability concerns
    137. 137. Scalability issues - ever-bigger systems</li></li></ul><li>Implementation Issues<br />Many virtualization systems require additional software loaded on servers<br />Device drivers, path managers, agents, “shims”<br />Additional maintenance and configuration can offset “single pane” benefits<br />Organizational issues can crop up<br />Virtualization blurs the lines between who owns what<br />Future datacenter combines server, storage, network<br />What about application?<br />
    138. 138. Cost Benefit Analysis<br />Benefits<br />Improved utilization<br />Tiering lowers per-GB cost<br />Reduced need for proprietary technologies<br />Potential reduction of administrative/ staffing costs<br />Flexibility boosts IT response time<br />Performance boosts operational efficiency<br />Costs<br />Additional hardware and software cost<br />Added complexity, vendors<br />Training and daily management<br />Reporting and incomprehensibility<br />Possible negative performance impact<br />Stability and reliability concerns<br />
    139. 139. Where Will You Virtualize?<br />
    140. 140. Closing Thought:What Is Virtualization Good For?<br />Virtualization is a technology not a product<br />What will you get from using it?<br />Better DR?<br />Improved service levels and availability?<br />Better performance?<br />Shortened provisioning time?<br />The cost must be justified based on business benefit, not cool technology<br />
    141. 141. Audience Response<br />Questions?<br />Stephen Foskett<br />Contoural, Inc.<br /><br /><br />