PHD Virtual: Optimizing Backups for Any Storage


Learn about the differences between virtual full and traditional full-and-incremental backup modes, and which mode works best depending on the type of storage.

Published in: Technology
  • Virtualization is the standard for powering and managing IT. And yet, for all its popularity, managing virtual environments presents numerous challenges, because many IT professionals are using complex, hard-to-use legacy tools, many of which were not designed for the virtual world. PHD Virtual is focused on challenging the status quo in virtualization management. We know that work is complicated enough, and the technology you use to get the job done shouldn’t be. That’s why we’re delivering data protection and infrastructure monitoring software that’s more effective, far easier to use, and much more affordable than our competition. If you talk to our customers, they will tell you they get “rock-star” treatment from us before, during, and after the sale. They love our technology because, in their words, they can just set it and forget it; it just works! We call it unmatched value for your virtual world, and we hope that after this short presentation you’ll know why we provide the absolute best value for your business.
  • Let’s get started by helping you understand a little more about our company. PHD Virtual provides the absolute best value in virtual backup and replication for VMware and Citrix platforms, and also offers an advanced infrastructure monitoring solution for physical, virtual, and cloud environments. More than 4,500 customers worldwide rely on our products; our customers are a who’s who among the Fortune 500, and we serve thousands of mid-market companies too. Thanks to the amazing reception we have had from customers, we have had 8 consecutive quarters of record revenue growth, which makes us one of the fastest-growing software companies around the globe. And, as you can imagine, our company has been recognized as standing out from our peers by receiving numerous awards, as well as a direct investment by Citrix.
  • Have you ever sat through a presentation and struggled to understand how that company is different from competitive alternatives? We believe it’s very important that we help you understand what sets PHD Virtual apart. Our approach for delivering unmatched value can be summed up in three simple differentiators. First, we deliver effective data protection and infrastructure monitoring products that are powerful, scalable, and deliver value in minutes. Second, our products are exceptionally easy to use; you’ll never need to read a manual or take a training course, and you can install and be fully configured in less than 10 minutes. What Apple is to consumer electronics, we are to virtualization management. Third, our products are priced at a fraction of the cost of competitive alternatives; great technology shouldn’t be so expensive. It’s why our customers represent our ultimate fan club. (Read quote.)
  • It’s also why we’ve experienced amazing growth, and we fully expect the growth to continue.
  • Let’s turn to our award-winning products and start with our data protection product, PHD Virtual Backup and Replication. PHD Virtual Backup and Replication provides advanced data protection powered by our unique Virtual Backup Appliance architecture. It is the #1 virtual backup solution and supports both VMware and Citrix. Let me start by covering a few key items before we give you an overview of specific capabilities and how it compares against competitors. Like all PHD products, it’s super easy to use: it installs and configures in minutes, requires no training, and you can literally set it and forget it. For most customers it’s all about time to recovery, and you’ll find no one is faster, and no product performs faster. We know that backup windows are short, so you need a product that moves at the pace of your business. You’ll find we use significantly less storage than any other alternative because of our powerful source-side deduplication capabilities and incremental backups, which store only changes. One of the key differentiators of our product is our Virtual Backup Appliance architecture. This patented capability is our secret sauce. The VBA plugs directly into the hypervisor and offers the most effective virtual backup at the lowest cost, because our unique VBA architecture requires no physical machines or third-party software. Let’s take a look at it in more detail.
  • First, I just wanted to touch on a few points that by now are becoming pretty well known in the industry. You constantly hear about virtualization growth, VM sprawl, and the overall proliferation of VM data. You’re starting to see shifts where mission-critical workloads are being virtualized more and more, which ups the ante on service levels, RTOs, and RPOs. These things can wreak havoc on an admin because they’re creating situations of much higher demand, yet budgets aren’t growing as fast as these demands are, and you don’t all of a sudden have tons of time freed up to spend redesigning your backup strategy if things go wrong. This is why backup needs to be at the forefront of any virtualization project. The process of moving and storing large amounts of data can be very intensive, even with a solution like PHD Virtual that provides a number of efficiencies. The good news is that there are ways to leverage a cost-effective, easy-to-use product, yet still optimize performance to meet these new demands.
  • So, as I mentioned just before, backup is a resource-intensive (CPU, disk, and network) process. In many cases it can top the chart of application consumers in the data center, even with its ability to leverage nice features such as source-side dedupe, compression, and changed block tracking. During your backup window, all solutions have four fundamental components to consider: the IOPS to perform reads on the source, or production, storage (READ); compression and deduplication (PROCESSING); the reads and writes that occur on the backup storage (WRITE); and finally, the connectivity used in your storage architecture, which covers the overall TRANSPORT of data, how it flows from production to backup, and the protocols the path will use. The write side also induces a non-trivial READ load on the backup storage, primarily for file system metadata access; and for some storage solutions that employ a lot of what we’ll call SMART logic (for example, a Data Domain that performs in-line deduplication and compression), this can cause performance issues. That’s mainly because a mix of high READ and WRITE IOPS is just not in their performance sweet spot. This particular issue has an impact proportional to the percentage of used space: the more data stored, the more intensive that process can be on those SMART storage subsystems.
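As a rough sketch of those READ / PROCESS / WRITE components, here is a minimal Python illustration of source-side deduplication plus compression. The 4 KB block size, SHA-256 hashing, and in-memory target are illustrative assumptions for the sketch, not how PHD's TrueDedupe is actually implemented:

```python
import hashlib
import zlib

BLOCK_SIZE = 4096  # illustrative block size, not a PHD parameter

def backup_pass(disk_bytes, seen_hashes, target):
    """One simplified pass: READ each block, PROCESS it (hash for
    dedupe, then compress), and WRITE only blocks not seen before."""
    stats = {"read": 0, "written": 0}
    for off in range(0, len(disk_bytes), BLOCK_SIZE):
        block = disk_bytes[off:off + BLOCK_SIZE]
        stats["read"] += len(block)
        digest = hashlib.sha256(block).hexdigest()  # dedupe key
        if digest not in seen_hashes:               # store unique blocks only
            seen_hashes.add(digest)
            target[digest] = zlib.compress(block)   # shrink before the WRITE
            stats["written"] += len(target[digest])
    return stats

# A disk image with repeated content dedupes and compresses well:
disk = b"\x00" * 8192 + b"payload!" * 1024
seen, target = set(), {}
stats = backup_pass(disk, seen, target)
```

A second pass over unchanged data writes nothing at all, which is the "store only changes" behavior that incremental backup modes rely on.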
  • Now that we have discussed backup as a process, let’s take a quick look at some of the variables in the environment that can affect your backup speed and efficiency.
  • A lot of backup software now includes replication capabilities, and the replication process can be a heavy READ load on the primary site. So if you’re leveraging replication, it’s important to understand how to optimize different aspects of your data protection strategy. When you think about VM replication at its core, you’re talking about moving data from a source to a target, similar to backup; the data just looks a bit different at the target when the process is done. To optimize a strategy where you’re doing both backup and replication, you can look for a solution that leverages the backup storage as the replication source. This lets you eliminate the double hit on production storage resources (reads and snapshots). PHD understands this situation well and therefore replicates from backup storage; this means only a single snapshot on the production VMs and one read pass on the production disk storage are required to back up and replicate. Other solutions require two snapshots and two read passes on the production disk storage. Also, separate your backup and replication jobs: running concurrent backup and replication jobs will typically have a greater impact on the infrastructure than the sum of backup and replication run separately (resource contention, etc.).
  • I’ll touch very quickly on the network here, because I just want to point out a couple of basic points. One obvious point is that, depending on how your datastores and backup targets are configured, your backup window can have a big dependency on the size of your network connection, as well as the contention it gets from other traffic. That’s kind of a given, so it’s always best to work to limit contention wherever you can. Therefore, in most cases, you should look for a backup solution that will move the least amount of data across the network. For example, PHD leverages TrueDedupe, VMware CBT, and a Virtual Full backup mode that I’ll talk about in more detail later, which has a true forever-incremental benefit that ensures minimal impact to your network. Also, in general you want to minimize the number of times data traverses a virtual network interface. Configuring your backup storage as an attached disk pushes data through the virtual I/O stack instead of over the virtual network, which can sometimes give you a nice performance boost. So think about that when considering using an NFS share as your backup target: you can actually configure it as an attached disk by first configuring the share as a datastore, instead of pushing data directly to the share.
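The "move the least data" point can be sketched as a changed-block transfer: only blocks flagged by a tracker such as VMware CBT are read and shipped across the network. The fixed block size and the set-of-indexes interface below are simplifications for illustration, not the real CBT API:

```python
def incremental_transfer(disk, changed_blocks, block_size=4096):
    """Ship only the blocks flagged as changed since the last backup,
    in the spirit of changed block tracking (a simplified model)."""
    payload = {}
    for idx in changed_blocks:                      # block indexes from the tracker
        off = idx * block_size
        payload[idx] = disk[off:off + block_size]   # read just the changed extent
    return payload

# 100-block virtual disk in which only block 7 changed since last night:
disk = bytearray(4096 * 100)
disk[4096 * 7:4096 * 8] = b"\x01" * 4096
sent = incremental_transfer(bytes(disk), {7})
# Only 1 of 100 blocks crosses the network.
```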
  • This transitions nicely to the next variable, which is the backup storage protocol. You really have two main protocol types here: file level and block level. File-level protocols require the backup application to interact with a network file system (NFS, CIFS), which in turn interacts with a storage file system (ext3, ext4, NTFS, etc.). Because of its distributed and shared nature, the network file system layer uses a significant amount of overhead to ensure data consistency; for example, opening an existing file on a CIFS share results in 6 network transactions. So if the backup application uses lots of small files to enhance deduplication and efficiency, this can have a negative effect on overall performance. Therefore, file-level protocols are generally not optimized for heavy random reads and writes, and more often than not CIFS exhibits worse performance than NFS. There is some support for this in the fact that two of the major hypervisors on the market, vSphere and XenServer, do not support CIFS for primary VM storage.
  • Now, the block-level protocol does things very differently. It allows the application to select (create) the file system that best meets its needs, supports efficient caching, provides the fastest access to the data (and the metadata), and provides increased reliability. Compared to the file-level protocol, that intermediate (NFS, CIFS) file system is removed from the access path, and this can have a huge benefit for backup, replication, and recovery performance. The main configurations supporting block-level protocols are local storage on the virtual host, Fibre Channel, or iSCSI.
  • As an add-on to the protocol discussion, how the storage is architected and connected to the environment can also impact efficiency and performance. In general, local storage for backup is the optimal model, simply because it reduces software overhead. But as I mentioned earlier, you can even optimize NFS by giving the share to the hypervisor, which optimizes the overall NFS traffic to the physical store; that’s something hypervisors have gotten very good at. We have a few real case studies of users getting huge benefits by switching from direct NFS backup storage to the same exact target connected through the hypervisor and used as an attached disk. Now you might be thinking: an attached disk has a 2 TB maximum, so this wouldn’t be very scalable for a virtual appliance backup solution. That’s usually true for most virtual appliances, but PHD Virtual leverages what we call Virtual Storage Pools, which allow you to connect multiple attached virtual disks to our VBA; we pool them together as one large backup target under the covers. Those same case studies where we saw 5x performance improvements were using this technology, which illustrates that the software overhead of the pool is minimal.
  • And a couple of other quick points for storage optimization are worth mentioning. In order to maximize the write speed of backups, configuring your LUNs across multiple spindles allows for higher aggregate throughput (parallel operation). And when you’re considering different RAID configurations, you should configure for the best WRITE performance; for example, RAID 10 provides better write performance than RAID 5. In summary, when considering storage architectures in your virtual environment, it’s best to think from the perspective of what’s best for the application (backup, for this discussion) and then make choices (protocol, location, spindles, etc.) down the stack based on what options are available at each point. Thinking from the storage up (or in very general terms) will often result in sub-optimal performance compared to a different architecture that may have been possible.
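As a back-of-the-envelope illustration of the spindle and RAID points, here is a sketch using the commonly cited RAID write penalties (2 for RAID 10, 4 for RAID 5). The spindle count and per-spindle IOPS figures are made-up example numbers, not measurements of any particular array:

```python
def effective_write_iops(spindles, iops_per_spindle, write_penalty):
    """Effective random-write IOPS for a backup target: aggregate raw
    IOPS divided by the RAID write penalty (mirror/parity overhead)."""
    raw_iops = spindles * iops_per_spindle   # more spindles, more parallel ops
    return raw_iops // write_penalty

raid10 = effective_write_iops(8, 150, write_penalty=2)  # 8 spindles, 150 IOPS each
raid5 = effective_write_iops(8, 150, write_penalty=4)
# With these numbers, RAID 10 sustains twice the write IOPS of RAID 5
# on the same spindles, and doubling the spindles doubles both.
```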
  • Up until now, I’ve spent quite a bit of time talking about how you can optimally configure your environment. I’ve also sprinkled in a few best practices and nice features of PHD Virtual to help you along that journey. But the fact of the matter is that you may not always be able to reconfigure an existing environment to be ideal. I mentioned earlier that PHD provides multiple backup modes to help you achieve the best performance possible and make the necessary tradeoffs to meet your data protection goals around ease of use, speed, and efficiency.
  • Let’s go into a bit of detail about those backup modes and which storage targets fit each mode. Our Virtual Full backup is pretty unique to PHD and provides some great efficiencies, with block-level deduplication across all VMs on the target, as well as technology that prevents users from ever needing to run another full backup. The end result is that you save money on storage and your backup windows are fast and predictable. Which is great, but as we’ve discussed at length, one method doesn’t fit all environments. The Virtual Full mode uses lots of very small files to store the blocks of data independently of each other. This is a core part of the storage efficiency and speedy VM recovery benefits you get from PHD. However, as we discussed earlier, lots of small files can be inefficient for certain storage targets. Therefore, we’ve added a traditional full and incremental mode that uses large backup files, limiting the amount of file processing that occurs. This backup mode has less storage efficiency built in, because it does require periodic full backups to be taken. However, this may still be the optimal mode for your data protection strategy, and I’ll go into a few of those use cases now.
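To make the storage tradeoff between the two modes concrete, here is a hedged back-of-the-envelope comparison. The VM size, daily change rate, and retention figures are purely illustrative, and the Virtual Full estimate ignores cross-VM dedupe and compression, so real ratios will differ:

```python
def full_incremental_storage(full_gb, change_rate, days, full_interval):
    """Storage (GB) to retain `days` of full/incremental backups,
    taking a new full every `full_interval` days."""
    fulls = -(-days // full_interval)   # ceiling division
    incrementals = days - fulls
    return fulls * full_gb + incrementals * full_gb * change_rate

def virtual_full_storage(full_gb, change_rate, days):
    """Forever-incremental estimate: one full's worth of unique blocks
    plus the daily changed blocks (cross-VM dedupe ignored)."""
    return full_gb + (days - 1) * full_gb * change_rate

# 500 GB VM, 5% daily change, 30 restore points, weekly fulls:
chain = full_incremental_storage(500, 0.05, 30, 7)    # 5 fulls + 25 incrementals
forever = virtual_full_storage(500, 0.05, 30)         # 1 "full" + 29 days of changes
```

Even in this crude model the forever-incremental approach retains the same 30 restore points in well under half the space, which is the "save money on storage" claim in numbers.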
  • Long-term retention: one of the really cool things you get with Virtual Full backups is flexible, efficient long-term retention. Because each backup stores only unique data, and there is never a need for another full backup to be run after the first time a VM is processed, we can allow for retention policies that store backups from years ago without having to write special full backups or manage full/incremental chains of backup files. This allows for an automated, efficient “thin the herd” concept that a lot of our customers love, because it mitigates their need for tape and minimizes the time they spend managing long-term data.
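The "thin the herd" idea can be sketched as a retention policy that keeps every recent restore point, then only one per week, then one per month, as points age. The 7/30/365-day windows below are illustrative assumptions, not PHD's actual defaults:

```python
from datetime import date, timedelta

def thin_the_herd(backup_dates, today):
    """Illustrative thinning: keep daily points for 7 days, one per
    ISO week up to 30 days, one per month up to 365 days."""
    keep = set()
    weeks_seen, months_seen = set(), set()
    for d in sorted(backup_dates, reverse=True):    # newest first
        age = (today - d).days
        if age < 7:
            keep.add(d)                             # recent: keep every point
        elif age < 30 and d.isocalendar()[:2] not in weeks_seen:
            weeks_seen.add(d.isocalendar()[:2])
            keep.add(d)                             # one restore point per week
        elif age < 365 and (d.year, d.month) not in months_seen:
            months_seen.add((d.year, d.month))
            keep.add(d)                             # one restore point per month
    return keep

today = date(2013, 6, 30)
dailies = [today - timedelta(days=n) for n in range(120)]
kept = thin_the_herd(dailies, today)
```

Because Virtual Full backups store only unique blocks, dropping the unselected points reclaims little space but keeps the catalog small; the same policy shape works for deciding which points to sweep to tape.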
  • Local attached disk on hypervisor storage using a block-level storage protocol has proven to be the highest-performing and most reliable backup data store. An example would be one or more locally attached virtual disks connected to the PHD VBA and created on a hypervisor disk/LUN.
  • With the advent of deduplicating hardware appliances, cloud gateway devices, and the use of network storage and off-site backup processes, there are a number of occasions where a more traditional full and incremental model will perform more optimally. The reason is that those storage targets perform better with a process that opens just a few backup files that are much larger in size and streams data into them, mitigating some of the inefficiencies that go along with the file-level protocols we discussed before. Additionally, tape also likes larger files. If you’re taking data off to tape infrequently, then you can still use the Virtual Full mode, because we have a cool feature called the PHD Exporter. The Exporter will rehydrate Virtual Full backups, which consist of many small files, into compressed OVF images that can be swept to tape and recovered from tape without ever needing PHD software. This process can be expensive at times, so if you have a lot of data that needs to go to tape, it is not always the best choice. As an alternative, you can use our full/incremental mode to limit the amount of data you need to take to tape each night. Also, it is important to note that PHD VBAs are flexible enough to back up VMs using both modes, to meet whatever your data protection strategy might be.

    1. Optimizing Backups for Any Storage
    2. Agenda • Corporate Overview • Why virtualization backup demands your attention • Characteristics of backup and the variables that affect its performance • PHD Strategies for Optimizing your Storage for Backup and Recovery • Q&A
    3. PHD Vision Virtual data protection and ensuring availability of physical, virtual and cloud environments should be easier. IT professionals who use our products should be excited about them! That’s why we put our users and their data first in everything we do, to deliver: • Powerful virtualization management software to make your job easier • Rock star treatment • Exceptional value Customers call our products amazing because they can: “set-it & forget-it and it just works” We call it “unmatched value for your virtual world”
    4. Who We Are Provider of virtualization backup and monitoring of physical, virtual and cloud environments • Products are powerful, easy to use and affordable Experiencing record growth • 9 consecutive quarters of record growth Industry leadership • Only backup company Citrix has invested in • Product of the year: Search Server Virtualization • Reader’s choice award: Virtualization Who We Serve
    5. What Makes PHD Virtual Unique? Unmatched Value: Effective products: • Powerful and scalable • Value evident out-of-the-box Easier to use: • Easy to download, install and use Lower cost: • Priced at a fraction of the cost of competitors Ultimate Fan Club! “I can’t afford to spend much time managing backups. PHD makes the process a piece of cake. How they put so much functionality into such an easy to use product, I don’t know, but they did.” - Marek Friedl, Hardt Equipment “PHD Virtual allows you to easily deploy multiple Virtual Backup Appliances that create unprecedented scalability, without the cost and complexity of other solutions.” - Virtualization Blogger, Roy Mikes
    6. Customer Success Consistent Customer Growth (chart of customers per year, 2005 through 2012: 150 700 300 1200 1500 2100 3500 5000)
    7. PHD Virtual Backup & Replication • Supports Citrix XenServer and VMware vSphere • Installs and configures in 10 minutes and requires no training • Performs 3 to 5 times faster on incremental backups • Uses less storage than any competitor • Unique Virtual Backup Appliance that plugs into the Hypervisor; scales quicker and easier • Requires no physical machines, 3rd party OS or software
    8. Why Virtualization Backup Demands More Attention • More data is being stored on VMs • Organizations trusting their critical data to virtualization • VM backups have remained an afterthought • Creating new VMs creates more backup work but your backup window hasn’t changed • Shorter RTO/RPO windows • Desire to decrease admin-hours to manage backups • Organizations struggling to adhere to backup storage policies • Demand for better disaster recovery
    9. Characteristics of a Backup and the variables that affect its performance
    10. Characteristics of a Backup The backup process is an intensive process • High number of IOPS (reads) on the production virtual disk storage • Data compression is a CPU intensive operation • Dedupe (hashing) is a CPU intensive operation • High number of IOPS (reads & writes) to the backup storage target • Network use depends on the location of both the production data and the storage target
    11. Variables that affect performance • Replication • Network • Backup Storage Protocols • Backup Storage Architectures
    12. Replication • Replication adds an additional “read” load on the primary site Ideal solutions will not impact production storage twice to perform replication • Don’t overlap processes Separate backup and replication job execution times
    13. Network • Are production data stores network based? • Is the backup data store on network storage? Leveraging this as attached virtual disk (providing the network storage to the hypervisor) provides best backup performance • Size of the connection • Competing traffic or processes running at that time
    14. Backup Storage Protocols File level (NFS / CIFS) • Presented as “managed-space”, which includes additional overhead • Are not optimized for high IOPS (heavy random read/write) • Require additional overhead to handle possible concurrent access from multiple sources
    15. Backup Storage Protocols cont’d Block Level • Presents raw block storage to the application • Allows the application to create the optimal file system locally • Block level protocols include: Local disks (SATA, SCSI, etc.) Fibre Channel iSCSI
    16. Backup Storage Architectures Where your backup storage is located can also impact performance • “Local Storage” – hypervisor connected storage (FC, iSCSI, SCSI) • Network storage – NFS or CIFS shares NFS can leverage “block level storage” by giving the share to the hypervisor rather than the backup appliance NFS & CIFS storage directly connected to your backup solution is treated as file level storage
    17. Backup Storage Architectures cont’d… How your storage is configured can have an effect on your backup performance • Distributed storage, increasing the number of spindles available, increasing the number of parallel operations • On the backup storage, select the best write performance RAID level of those available (RAID 10 is better than RAID 5)
    18. PHD Strategies for Optimizing your Storage for Backup and Recovery
    19. Full and Incremental Backup Mode (timeline diagram) Full & Incremental: Full, then Inc, Inc, Inc, …, with a new Full each cycle. Virtual Full: Initial backup, then V Full at every subsequent point.
    20. When to Use Virtual Full Backups • Looking for deduplication at the source Saves on storage Saves on network bandwidth • Want long-term retention on disk • Need very predictable backup windows
    21. Virtual Full Storage Model Best Performance Model: Attached disk storage through the hypervisor leveraging block level protocols (Fibre Channel, SCSI, iSCSI, NFS)
    22. When to Use Full / Incremental Backups • You are writing to: Deduplicating storage (ExaGrid, Data Domain, Windows Server 2012) You are going to a cloud gateway or WAN optimizer You are backing up directly to off-site storage CIFS/NFS • You must go to tape/removable media frequently NOTE: You can leverage both modes
    23. Key Takeaways • Don’t forget to optimize your backup storage • Consider competing processes, storage protocols, and network • We have flexible storage options to optimize backup and recovery processes • We support all storage types and have customers using them successfully
    24. 866-710-1882 (USA and Canada) +1-267-298-5320 (International) Q & A