Top Troubleshooting Tips and Techniques for Citrix XenServer Deployments

This session will provide an expert insight into the most common issues encountered by Customers, Partners and Support engineers.
It’s a feature packed agenda which gets to the point quickly and concentrates on the issues we encounter continuously with XenServer deployments.

  • When troubleshooting XenServer issues, it is important to understand the XenServer architecture. Although you can use XenCenter to work with XenServer hosts, you can also use ssh to connect to Dom0, the VM with special privileges that must be running on each XenServer host. There are a variety of ways to talk to XenServer (various consoles, cloud-management software, PowerShell etc.), but in the background we are usually talking directly to the XAPI toolstack. On the Dom0 console, we can use xe commands to access the Xen API. We normally use standard Linux commands and tools to talk directly to the Xen hypervisor and to the hardware. When XAPI commands are issued, standard Linux configuration files are used in Dom0, and XAPI also calls device drivers to talk directly to the storage.
  • Generally speaking, whether you are talking about a storage area network or local storage (i.e. the local disk on the XenServer itself), there are 2 types of storage: thin provisioned or thick provisioned. A good analogy for thin provisioned storage is a file divider: it may start with no files in it, but as you add more and more files it starts to expand. It is exactly the same in the virtual world. The danger is that you can over-allocate your storage. Thick provisioning is more analogous to box files (as illustrated in the diagram to the far right). A box file will always take up a certain amount of space regardless of whether there is 1 file in there or 100 files. In the virtual world, creating a 40GB disk is very similar: there will always be 40GB reserved for the contents, even if the files you add take up only a fraction of the entire size of the disk. Depending on the type of storage you use, it will be thin or thick.
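The over-allocation risk with thin provisioning can be illustrated with a little arithmetic. The sketch below is only an illustration: the function names and the sample sizes are hypothetical and are not part of any XenServer tool.

```python
def overcommit_ratio(virtual_disk_sizes_gb, physical_capacity_gb):
    """Ratio of promised (virtual) capacity to real capacity on a thin SR."""
    return sum(virtual_disk_sizes_gb) / physical_capacity_gb


def is_over_allocated(virtual_disk_sizes_gb, physical_capacity_gb):
    """True when the disks could, if filled, exceed the physical storage."""
    return sum(virtual_disk_sizes_gb) > physical_capacity_gb


# Hypothetical example: three 40GB virtual disks on 100GB of thin storage.
disks = [40, 40, 40]
print(overcommit_ratio(disks, 100))
print(is_over_allocated(disks, 100))
```

With thick provisioning the same three disks could not be created at all on 100GB, because each one reserves its full 40GB up front.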
  • If you want to create a point-in-time backup of a virtual machine, a snapshot should be created on the xe command line or within XenCenter. Our snapshot technology enables backup vendors to interface directly with XenServer, allowing in-guest full and incremental backups of VMs. Snapshots are available on all storage platforms through the LVHD implementation. Creating and deleting snapshots differs quite substantially depending on whether you are using thin or thick provisioned storage.
  • Assume you created a virtual disk (along with your VM) of 40GB and used 20GB of storage space for the operating system and various files (represented by ‘A’ in the above diagram). Further assume that thin provisioning is in use. Once we take a snapshot, the following happens. Disk A is deflated to 20GB and becomes read-only. A new disk for writing data is created (initially close to 0 bytes in size as it is thinly provisioned); it only holds VHD header information, so let's call it 0 bytes for simplicity. A snapshot VHD is also created, which is also close to 0 bytes in size. The total size of the disk after taking the snapshot is c. 20+0+0 = c. 20GB.
  • Assume you created a virtual disk (along with your VM) of 40GB and used 20GB of storage space for the operating system and various files (represented by ‘A’ in the above diagram). Now assume that thick provisioning is in use (i.e. you are using Fibre Channel, iSCSI or local LVHD). Once we take a snapshot, the following happens. Disk A is deflated to 20GB and becomes read-only. A new disk for writing data is created and inflated to 40GB. A snapshot VHD is also created, which is close to 0 bytes in size, i.e. it contains just VHD header information. The total size of the disk after taking the snapshot is c. 20+40+0 = c. 60GB.
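The two worked examples above reduce to the same sum with one term changed. A minimal sketch of that arithmetic follows, ignoring VHD header overhead as the notes do; the function name is hypothetical.

```python
def space_after_snapshot(vdi_size_gb, data_written_gb, thick):
    """Approximate space used by one VDI after its first snapshot.

    Follows the walkthrough in the notes and ignores VHD headers:
      parent A   is deflated to the data already written (read-only)
      active B   is about 0 when thin, the full VDI size when thick
      snapshot C holds header information only, so about 0
    """
    parent = data_written_gb
    active = vdi_size_gb if thick else 0
    snapshot = 0
    return parent + active + snapshot


# The 40GB disk with 20GB written, as used in the notes.
print(space_after_snapshot(40, 20, thick=False))  # thin: NFS / EXT
print(space_after_snapshot(40, 20, thick=True))   # thick: LVHD, iSCSI, FC
```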
  • If you delete snapshots (e.g. if you delete the VHDs labelled ‘C’ and ‘E’ in the diagram to the top left), we should be able to coalesce all of the information from disks B and D back into A if writes have been made to both. Note that the longer the chain, the slower this process is going to be.
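The coalescing behaviour described above can be modelled as a toy tree operation: whenever a parent is left with exactly one child, that child is merged into it. This is a simplified sketch of the idea only, not the actual XenServer garbage collector, and the function name is hypothetical.

```python
def coalesce(parents):
    """Merge every only-child VDI into its parent until nothing changes.

    `parents` maps each VDI name to its parent name (None for the root).
    """
    parents = dict(parents)
    changed = True
    while changed:
        changed = False
        # Group the VDIs by parent so single-child parents can be found.
        children = {}
        for node, parent in parents.items():
            if parent is not None:
                children.setdefault(parent, []).append(node)
        for parent, kids in children.items():
            if len(kids) == 1:
                child = kids[0]
                merged = parent + "+" + child
                grandparent = parents.pop(parent)
                parents.pop(child)
                # Anything that hung off the child now hangs off the merge.
                for node, p in list(parents.items()):
                    if p == child:
                        parents[node] = merged
                parents[merged] = grandparent
                changed = True
                break
    return parents


# The tree from the slide: A is the root, C and E are the snapshots.
tree = {"A": None, "B": "A", "C": "A", "D": "B", "E": "B"}
del tree["C"]                 # delete snapshot C
tree = coalesce(tree)
print(tree)
del tree["E"]                 # delete snapshot E as well
print(coalesce(tree))
```

After C is deleted, B is merged into A while D and E remain as its children; once E is deleted too, D is merged as well and a single VDI is left.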
  • The command vhd-util scan -m 'V*' -l <SR_VG> -p will show a parent VDI on the left and the child VDIs underneath.
  • The vhd-util repair command can fix corrupted VHD header information. VHD files contain a second copy of the header as a backup, so there is a good possibility that a corrupted header can be repaired from it.
  • The amount of disk space taken by snapshots depends on the size of the VDI and the amount of shared data.
  • The key components of XenTools are:
    ◦ XEN: initializes and owns the hypercall interface
    ◦ XENFILT: unplugs emulated devices
    ◦ XENBUS: creates the main PV interfaces (e.g. EVTCHN, XENSTORE etc.) and enumerates all PV classes (e.g. VIF, VBD etc.)
    ◦ XENVIF: network class driver, enumerates VIFs
    ◦ XENNET: network interface driver / NDIS 6.0 miniport
    ◦ XENVBD: storage class driver (enumerates VBDs) / Storport miniport (the SCSI filter driver is no longer needed and the Storport miniport driver is much faster)
    ◦ XENIFACE: WMI API to XENSTORE
  • For the XenServer 6.1 product the XenTools package has changed significantly. To support Windows XP and Windows Server 2003, the legacy drivers and install package continue to be supported and maintained (i.e. inclusion of bug fixes). For Windows XP and Windows Server 2003, the installer app is now called xenlegacy. The familiar xensetup.exe meta-installer is still available. Due to the new WDK (Windows Driver Kit) for Windows 8, the driver set had to be split between the newer drivers (Windows Vista and higher) and the older ones for Windows XP and Windows Server 2003.
  • To make things simpler, a single install wizard is used.
  • Issues resolved in Hotfix XS61E009 and XS61E010:
    1. Virtual Machines (VMs) with out-of-date XenServer Tools may not be flagged as "out of date" in XenCenter. Hotfix XS61E009 resolves this issue and enables customers to be notified in XenCenter when new XenServer Tools are available.
    2. Booting a Citrix Provisioning Services (PVS) target device using a Boot Device Manager (BDM) image can take an extended time to complete. Hotfix XS61E009 resolves this issue.
    3. Customers using XenServer Platinum Edition to license Citrix Provisioning Services (PVS) may find that one PVS license per VM is checked out, rather than one PVS license per XenServer host. This may lead to a shortage of PVS licenses and an inability to provision VMs. Hotfix XS61E010 along with CTX135672 - Hotfix CPVS61016 (Version 6.1.16) - For Citrix Provisioning Services 6.1 - English resolves this issue.
    4. Attempts to shut down Microsoft Windows Vista and later VMs can cause intermittent blue screen errors, with a "STOP: 0x0000009f..." error message. Hotfix XS61E010 resolves this issue.
    5. Adding more than eight NICs to Microsoft Windows Vista and later VMs using the xe CLI can lead to a blue screen error on reboot. Hotfix XS61E010 resolves this issue.
    6. Copying data to a Microsoft Windows 2003 VM can cause the VM to hang and lead to a grey screen error. Hotfix XS61E010 resolves this issue.
    7. When Dynamic Memory Control (DMC) is enabled, attempts to migrate Microsoft Windows XP and later VMs using XenMotion can cause the VMs to hang and lead to a blue screen error. Hotfix XS61E010 resolves this issue.
    8. When the Citrix Xen Guest Agent service is running, Cut and Paste will not work between a XenDesktop virtual desktop and the endpoint device. Hotfix XS61E010 resolves this issue.
  • Each of the issues below has also been resolved in Hotfix XS61E009 and XS61E010:
    1. When the Citrix Xen Guest Agent service is running, Cut and Paste will not work between a XenDesktop virtual desktop and the endpoint device. Hotfix XS61E010 resolves this issue.
    2. Microsoft Windows XP and later VMs may hang during the boot process and may have to be forced to reboot. Hotfix XS61E010 resolves this issue.
    3. Attempting to install or upgrade the XenServer Tools on Microsoft Windows Vista and later VMs that do not have access to a paravirtualized or an emulated network device can cause the installation process to hang. Hotfix XS61E010 resolves this issue.
    4. Manually installing the Legacy XenServer Tools without changing the device_id to 0001 can result in a "STOP: 0x0000007B..." error when rebooting a Windows VM. After installing Hotfix XS61E010, customers will not be able to manually install the Legacy XenServer Tools by running xenlegacy.exe. When customers start the XenServer Tools installation process, installwizard.msi launches automatically.
    5. Microsoft Volume Shadow Copy Services (VSS) (required for third party backup solutions) was unavailable on Microsoft Windows Server 2008 in the original version of XenServer 6.1.0. After installing Hotfix XS61E010, XenServer 6.1.0 customers will be able to take quiesced snapshots on Microsoft Windows Server 2003 and Windows Server 2008 VMs. For Windows Server 2008 R2, see the guidance that appears earlier in this section.
  • Some workarounds have been followed in relation to known issues (as discussed). It is very important that the guidance provided in http://support.citrix.com/article/CTX135099 is followed in these cases. This article gives a comprehensive overview of how to troubleshoot XenServer Tools in general, and particularly in relation to XenServer 6.1.
  • A key advantage of the Windows Update model is that bug fixes and new features can now be delivered automatically and independently of any XenServer release. Windows administrators and XenServer administrators can work independently, which is better suited to cloud-based installations.
  • At Citrix Services we're Citrix consultants, teachers and support engineers, and we're all about one thing: making sure you succeed. With our help, you'll deploy high-performance, robust virtualization and networking projects faster, with dramatically lower risk and higher return. The best Citrix architects and administrators are the ones who never stop learning, and Citrix Education is here to help you learn those skills. Citrix Consulting gives you direct access to our most experienced virtualization and networking experts. When it's complex, when it's mission-critical, when it's big: that's when Citrix consultants can really help. On your virtualization journey, you'll want always-on support from people who really care about your success. There's no better insurance for your Citrix investment than Citrix Support.
  • Secrets of the Citrix Support Ninjas is a FREE eBook available next week. The eBook contains 40 insider troubleshooting tips for administrators. The purpose of the eBook is to help administrators like you keep your Citrix deployments on track. We've collected some of our support engineers' best tips and tricks for running robust Citrix environments and packaged them up into a free eBook. In it, you'll discover some of the little-known tricks that our own support people use every day to tune, tweak, troubleshoot and test Citrix solutions. You may know a few of these tips, but you probably don't know them all. And, you never know, you might discover just one that will change your life as an administrator. Let me give you a sneak peek now.
  • "Sometimes setting up a NetScaler may seem impossible. What do you do when you hit a roadblock at an early stage? In this session, Ronan will step through the most common issues you can experience at an early stage of your NetScaler deployment, and how to diagnose them using on-box tools. During this session you will learn: troubleshooting basic setup issues; health checks; introducing redundancy and removing single points of failure; logging (what happened historically)."
  • Please note that some images used in this presentation are c. Presenters Media, 2013 (All rights reserved)

Transcript

  • 1. June 27, 2013 Citrix Support Secrets Webinar Series: Top Troubleshooting Tips and Techniques for Citrix XenServer Deployments. Mark Butterly, Senior Readiness Specialist; Giovanni Di Tizio, Technical Relationship Manager
  • 2. Introduction • Storage Management and Snapshot Secrets ◦ How it works: how space is utilized • The "VDI not available" problem • Hotfixes and Driver installation • XenTools / XenTools installation • Q+A Tips, Tricks + Techniques
  • 3. Storage Management and Snapshot Secrets
  • 4. XenServer Architecture [Diagram labels: Control Domain (Dom 0), Virtual Machines, Xen Hypervisor, Hardware, Local Storage, Network Card, Remote or SAN based Storage, SSH, Linux, XAPI]
  • 5. Thin vs. Thick Provisioning
  • 6. Snapshots • Disk and Memory ! • Enables Backup vendors to interface directly with XenServer • Snapshots available on all storage platforms • Thin vs Thick Provisioning
  • 7. Snapshot (NFS and EXT Local Storage) • Resulting VDI tree • Disk utilization ◦ VHD files thin provisioned ◦ VDI A contains writes up to point of snapshot ◦ VDI B and C are empty* ◦ Total: • VDI A: 20 • VDI B: 0* • VDI C: 0* ◦ Snapshot requires no space* (* Plus VHD headers) [Diagram: VDI tree of A, B and C, each labelled with (1) size of VDI and (2) data written in VDI; key: Snapshot, Clone, Parent, Active]
  • 8. Snapshot (Local LVHD, iSCSI or FC SR) • Resulting VDI tree • Disk utilization ◦ Volumes are thick provisioned ◦ Deflated where possible ◦ Total: • VDI A: 20 • VDI B: 40* • VDI C: 0* ◦ Snapshot requires 40 + 20GB (* Plus VHD headers) [Diagram: VDI tree of A, B and C, each labelled with (1) size of VDI, (2) data written in VDI and (3) inflated / deflated state; key: Snapshot, Clone, Parent, Active]
  • 9. Automated Coalescing Example 1) VM with two snapshots, C and E 2) When snapshot C is deleted… 3) Parent B is no longer required and will be coalesced into A [Diagram: VDI trees of A, B, C, D and E at each step, ending with A + B as parent of D and E; key: Snapshot, Clone, Parent, Active] http://support.citrix.com/article/CTX122978
  • 10. Snapshot - LVHD based SR example • Space cost = existing data on disk + VDI size • Empty VDI size on disk = 8.00M • Example: vhd-util scan -m 'V*' -l <SR_VG> -p • vhd-util check -n <LV_PATH> • LV needs to be available (LV active); otherwise: Error 2, LV not active • LV read/write or read only: check gives different output
  • 11. VHD Repair • Example: vhd-util repair -n <LV_PATH> • The repair command can fix VHD cookies and other VHD header elements only if the backup header is present and valid. • Repair is unable to fix VHD content (data).
  • 12. VHD Chains • VDI chain growth • VDI chain hits length limit = 30 ◦ Error code: SR_BACKEND_FAILURE_109 Error parameters: , The snapshot chain is too long • Trigger coalescing • VHD format introduces overhead on IO ◦ Write overhead added by VHD format • Read overhead of VHD format multiplied by chain length
  • 13. Key tips for handling Snapshots • For a "golden image": ◦ Create a new VM and convert to template ◦ Do not create it based on a snapshot • Full vdi-copy is the only way to reduce VHD chain to 0 • NEVER EVER MANUALLY REMOVE ANY VDI(s) FROM A CHAIN!! • Manually reclaim disk space: http://support.citrix.com/article/CTX123400 • Don’t use Snapshots as a backup. ◦ Bad use cases – Databases, MS Exchange,
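As a rough illustration of the chain-length limit on this slide, the sketch below walks a child-to-parent map and counts how many parent VHDs sit above a given VDI. The limit of 30 comes from the slide; how XenServer itself counts the chain is not stated there, so counting parents is an assumption, and the function names are hypothetical.

```python
MAX_CHAIN_LENGTH = 30  # limit quoted on the slide


def chain_length(parents, vdi):
    """Count the parent VHDs above `vdi` (assumed definition of length).

    `parents` maps each VDI name to its parent name (None for a root).
    """
    length = 0
    seen = set()
    while parents.get(vdi) is not None:
        if vdi in seen:
            raise ValueError("cycle detected in VDI chain")
        seen.add(vdi)
        vdi = parents[vdi]
        length += 1
    return length


def chain_at_limit(parents, vdi, limit=MAX_CHAIN_LENGTH):
    """True when another snapshot would be expected to hit the limit."""
    return chain_length(parents, vdi) >= limit


# Hypothetical chain: A is the root, B its child, D the active leaf.
chain = {"A": None, "B": "A", "D": "B"}
print(chain_length(chain, "D"))
print(chain_at_limit(chain, "D"))
```

Because the slide notes that read overhead is multiplied by chain length, the same count also gives a feel for how many VHDs a read may have to consult in the worst case.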
  • 14. VDI not Available
  • 15. VDI not available… It may happen for different reasons; these are the 2 most common: • The VDI is actually missing (maybe deleted by mistake) • The VDI is locked to a host where the VM was running.
  • 16. VDI is actually missing… In this example we can see that our VM “Linux03” has 2 VDIs assigned:
  • 17. VDI is actually missing… Here we can see further details about the VDIs:
  • 18. VDI is actually missing… It might happen that someone (by mistake of course) deletes one of the VDIs (in this example Test03_0): Note: Renaming the LV will have the same effect.
  • 19. VDI is actually missing… When you try to start the VM you get on XenCenter: But if you try from the CLI we can see further information:
  • 20. VDI is actually missing… If you check on SMlog you can see:
  • 21. VDI is actually missing… Summary • In this case the only way to recover would be to restore the missing VDI. • It is not uncommon in NFS scenarios for files to be removed; this is very unlikely in an LVS scenario. • Starting the VM from the CLI provides a quick way of finding out what the problem is.
  • 22. The VDI is locked to a host where the VM was running… To prevent data corruption XAPI keeps a lock on the VDI of the VMs in use. The lock is indicated by: host_OpaqueRef, under the parameter: sm-config of the VDI. In this example we are going to use a VM called Linux01 which is already running:
  • 23. The VDI is locked to a host where the VM was running… In this example we are going to use a VM called Linux01 which is already running:
  • 24. The VDI is locked to a host where the VM was running… When we see the details of one of the VDIs the “host_OpaqueRef” setting is indeed there:
  • 25. The VDI is locked to a host where the VM was running… If we dig into XAPI we can see that it refers to a host: In this case “xs-lab02-giovad”
  • 26. The VDI is locked to a host where the VM was running… When the VM is not running that parameter is cleared from XAPI
  • 27. The VDI is locked to a host where the VM was running… So what happens if the parameter is stale in XAPI and the VM is no longer running? This could happen if the host where the VM was running dies or becomes unresponsive.
  • 28. The VDI is locked to a host where the VM was running… When you try to start the VM in XenCenter you will see: While in the CLI you will see instead:
  • 29. The VDI is locked to a host where the VM was running… In /var/log/SMlog you will see:
  • 30. The VDI is locked to a host where the VM was running… Troubleshooting: • Try to start the VM from the CLI; this will provide further details about the error, in this case “The VDI is not available… Already attached RW” • The error shown on the CLI or in SMlog will also indicate which VDI is the problematic one. • Further confirmation can be gathered by checking for “host_OpaqueRef” in the sm-config parameter of the VDI. • Make sure the VM is not running and that the power status is actually wrongly set in XAPI. Run list_domains on all the hosts of the pool and grep for the uuid of the VM.
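The check for a stale “host_OpaqueRef” entry can be sketched as a small text filter. Everything about the text layout here is an assumption: the sketch supposes that the sm-config value is available as "key: value" pairs separated by semicolons, and the sample string and its reference values are made up for illustration. Only the host_OpaqueRef key name and the RW mode come from the slides.

```python
def parse_sm_config(text):
    """Split an assumed 'key: value; key: value' string into a dict.

    The last colon in each item separates key from value, so a key such
    as 'host_OpaqueRef:<ref>' keeps its embedded colon.
    """
    result = {}
    for item in text.split(";"):
        item = item.strip()
        key, sep, value = item.rpartition(":")
        if sep:
            result[key.strip()] = value.strip()
    return result


def host_locks(sm_config):
    """Return only the entries that look like host attachment locks."""
    return {k: v for k, v in sm_config.items()
            if k.startswith("host_OpaqueRef")}


# Hypothetical sm-config text; the reference values are placeholders.
sample = "host_OpaqueRef:0123abcd: RW; vhd-parent: 4567ef"
print(host_locks(parse_sm_config(sample)))
```

A non-empty result for a VM that is confirmed not to be running on any host would match the stale-lock situation described on the slides.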
  • 31. The VDI is locked to a host where the VM was running… How to Recover: • We need to clear host_OpaqueRef but the parameter is RO • Basically there are 2 options: 1. Manually modify XAPI's database (VERY DANGEROUS and completely unsupported) 2. Forget the VDI and add it again, which will clear the host_OpaqueRef parameter.
  • 32. The VDI is locked to a host where the VM was running… How to Recover: • Once the VDI has been forgotten we do an SR scan to add it back • It will show with no name and no description; we should rename it accordingly using the properties option in XenCenter
  • 33. The VDI is locked to a host where the VM was running… How to Recover: • We then re-attach the disk to the VM
  • 34. The VDI is locked to a host where the VM was running… How to Recover: • We then re-attach the disk to the VM • And if we check for host_OpaqueRef we'll see that it is gone…
  • 35. The VDI is locked to a host where the VM was running… How to Recover: • At this stage we can start the VM again
  • 36. The VDI is locked to a host where the VM was running… Tips: • If you have many VDIs which are the same size it will be hard to know which one is the correct one if you forget many at once, so it is better to do one at a time. • The order in which you attach the VDIs is important, so this also needs to be done in order. • In XenServer 6.1 we provide a script that does this automatically for all the VMs on the missing host as part of the procedure to recover from missing members of a pool, see: http://docs.vmd.citrix.com/XenServer/6.1.0/1.0/en_gb/reference.html#pool_failures
  • 37. Hotfixes and driver installation
  • 38. Hotfixes and driver installation • XenServer uses industry-standard open source device drivers. • It is not possible for Citrix to test every piece of hardware available, so we rely on the vendors for testing and certification. • Drivers are updated on a regular basis to fix bugs or improve performance; however, a driver must always match the firmware of the device. Installing a newer driver on very old firmware will cause more problems than it solves, and the same happens the other way around.
  • 39. Hotfixes and driver installation My device doesn't work anymore after applying a hotfix!! • Some hotfixes provide a new kernel, therefore we need to install the driver that matches the new kernel • For example Hotfix XS602E021 - For XenServer 6.0.2 provides a new kernel:
  • 40. Hotfixes and driver installation My device doesn't work anymore after applying a hotfix!! • Therefore we need to install the drivers compiled for that kernel version:
  • 41. Hotfixes and driver installation How do I know which driver/firmware I am running? • In the case of network interfaces “ethtool -i” will help:
      # ethtool -i eth0
      driver: bnx2
      version: 2.2.1j
      firmware-version: 7.2.20 bc 5.2.3 NCSI 2.0.11
      bus-info: 0000:01:00.0
    • If you know the name of the driver you could also use modinfo, for example:
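When driver and firmware versions need to be compared across several hosts, the “ethtool -i” output shown on this slide is easy to turn into a dictionary. The sketch below is a minimal example that parses the sample text from the slide; the function name is hypothetical, and running ethtool itself is left out so the example stays self-contained.

```python
def parse_ethtool_info(text):
    """Turn 'key: value' lines from `ethtool -i` output into a dict.

    Only the first colon on each line is treated as the separator, so
    values such as the PCI address in bus-info keep their own colons.
    """
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


# Sample output copied from the slide.
SAMPLE = """driver: bnx2
version: 2.2.1j
firmware-version: 7.2.20 bc 5.2.3 NCSI 2.0.11
bus-info: 0000:01:00.0"""

info = parse_ethtool_info(SAMPLE)
print(info["driver"], info["version"], info["firmware-version"])
```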
  • 42. Hotfixes and driver installation How do I know which driver/firmware I am running? • For HBA drivers you will need to check kern.log during boot time, for example: • Or use modinfo if you know the driver involved:
  • 43. Hotfixes and driver installation How do I know which driver/firmware I am running? • For HBA firmware you will need to check the HBA BIOS during boot time; the way to access it changes from vendor to vendor, so check with them if in doubt. • QLogic HBAs load the firmware when they load the driver, so unless the HBA is running a very old firmware (previous to version 4.x) there's no need to verify the version.
  • 44. XenServer Tools (aka XenTools)
  • 45. Why XenServer Tools (aka XenTools)? • Optimize your VMs !!! • Provides high performance drivers and a management agent • Always install after a new VM to make it optimized
  • 46. Legacy support for drivers (XenServer 6.1) • For XP & Windows Server 2003:
  • 47. To make things simpler …. • For All Versions of Windows: • Install Wizard provided • .Net 3.5 or above is required • 2 parts/phases to install • MetaInstaller needs to be installed first • Drivers are then installed
  • 48. Troubleshooting … • If the install wizard installer goes wrong, run it again with the argument /log logfile.txt • If it still goes wrong, look in C:\programdata\citrix\InstallWizard* ◦ Logs the driver and agent msi install logs ◦ Install.txt has other (verbose) install messages • The old installer has install.log in c:\program files\xentools as before • Watch the antivirus
  • 49. Issues with XenTools (XenServer 6.1) • VMs with out of date XenServer tools may not be shown in XenCenter • Slow boots for a PVS target device using a Boot Device Manager • PVS licensing issues • Intermittent blue screen errors when shutting down MS Vista VMs • Adding more than eight NICs to MS Vista causing blue screens • Copying data to a MS Windows 2003 VM causes the VMs to hang / grey screen • When Dynamic Memory Control (DMC) is enabled, using XenMotion causes VMs to hang and blue screen.
  • 50. Issues with XenTools (XenServer 6.1) • Cut and paste issues between a XenDesktop VDA and an endpoint (when the Citrix Xen Guest Agent is running) • Windows XP and later VMs intermittently hanging during the boot process • Installation hangs when installing tools on Vista / later VMs without access to a PV / emulated network device • Manually installing legacy tools without changing the device id to 0001 can result in a blue screen • VSS (required for third party backup solutions) is now available for use with XenServer 6.1
  • 51. Resolving issues with XenTools • Apply the following hotfixes: XS61E009, XS61E010
  • 52. XenTools Troubleshooting • Consult the following, particularly if you have applied workarounds to fix the issues discussed: http://support.citrix.com/article/CTX135099
  • 53. Future Windows Update Support • Driver model restructuring allows independent upgrades • Deployment of bug fixes and new features out of band • No need for ISO injection • Updates can be installed automatically at the domain administrator's discretion • WSUS deployments not yet available but will be made available
  • 54. About Citrix Services Citrix Services make sure you succeed with your virtualization programs. How we can help Citrix Education – The fastest, most efficient way to get your team the virtualization skills they need. Online, on-site or in class. citrix.com/training Citrix Consulting – Intensive engagements for complex, critical or just plain massive projects. citrix.com/consulting Citrix Support – Always-on support services that leverage everything we know about best-practice deployment and maintenance. citrix.com/support Educate | Guide | Support | Succeed
  • 55. Secrets of the Citrix Support Ninjas • 40 insider troubleshooting tips • Covering XenDesktop, XenServer, XenApp and NetScaler • Citrix Support top engineers • FREE eBook • Citrix Auto Support • Now available!
  • 56. Premier Support Calculator Check it out
  • 57. Next Webinar: July • Title: Troubleshooting XenApp with the Citrix Diagnostic Toolkit • Description: When problems occur, support engineers need data points, debug tracing and context information to help determine root causes. Preparation and organization of commonly used tools has always been a time-consuming challenge, especially during outages. The Citrix Diagnostics Toolkit (CDT) addresses these challenges by rapidly deploying a suite of tools and options in an easy-to-use structured format. • When: July 25th • Registration Now!
  • 58. Work better. Live better.