This session provides expert insight into the most common issues encountered by customers, partners and support engineers.
It’s a feature-packed agenda that gets to the point quickly and concentrates on the issues we encounter continuously with XenServer deployments.
Top Troubleshooting Tips and Techniques for Citrix XenServer Deployments
1. June 27, 2013
Citrix Support Secrets
Webinar Series
Top Troubleshooting Tips and Techniques for Citrix
XenServer Deployments
Mark Butterly, Senior Readiness Specialist
Giovanni Di Tizio, Technical Relationship Manager
2. Introduction
• Storage Management and Snapshot Secrets
ᵒHow it works - How space is utilized
• The 'VDI not available' problem
• Hotfixes and Driver installation
• XenTools / XenTools installation
• Q+A
Tips, Tricks + Techniques
4. XenServer Architecture
[Diagram: the Control Domain (Dom 0, with Linux, XAPI and SSH) and the Virtual Machines run on top of the Xen Hypervisor, which runs on the Hardware (Local Storage, Network Card, Remote or SAN based Storage)]
6. Snapshots
• Disk and Memory !
• Enables Backup vendors to interface directly with XenServer
• Snapshots available on all storage platforms
• Thin vs Thick Provisioning
7. Snapshot (NFS and EXT Local Storage)
• Disk utilization
ᵒVHD files thin provisioned
ᵒVDI A contains writes up to point of snapshot
ᵒVDI B and C are empty*
ᵒTotal:
• VDI A: 20 GB
• VDI B: 0 GB*
• VDI C: 0 GB*
ᵒSnapshot requires no space*
* Plus VHD headers
• Resulting VDI tree
[Diagram: VDI tree with parent A and children B and C, each annotated with (1) size of VDI and (2) data written in VDI. Key: Snapshot, Clone, Parent, Active]
8. Snapshot (Local LVHD, iSCSI or FC SR)
• Disk utilization
ᵒVolumes are thick provisioned
ᵒDeflated where possible
ᵒTotal:
• VDI A: 20 GB
• VDI B: 40 GB*
• VDI C: 0 GB*
ᵒSnapshot requires 40 + 20 GB
* Plus VHD headers
• Resulting VDI tree
[Diagram: VDI tree with parent A and children B and C, each annotated with (1) size of VDI, (2) data written in VDI and (3) inflated / deflated state. Key: Snapshot, Clone, Parent, Active]
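The disk utilization totals on slides 7 and 8 can be reproduced with a little shell arithmetic. This is only a sketch, not a XenServer tool: the function name snapshot_cost and its arguments are made up for illustration, sizes are in GB, and VHD headers are ignored.

```shell
# snapshot_cost TYPE SIZE_GB WRITTEN_GB
# Prints the approximate space (GB) used by the VDI tree after ONE snapshot.
# Hypothetical helper for illustration; VHD header overhead is ignored.
snapshot_cost() {
  case "$1" in
    thin)  echo "$3" ;;             # parent keeps the written data, B and C are ~0
    thick) echo $(( $3 + $2 )) ;;   # deflated parent + fully inflated active VDI
    *)     echo "unknown provisioning type: $1" >&2; return 1 ;;
  esac
}

snapshot_cost thin 40 20    # NFS / EXT example from slide 7: prints 20
snapshot_cost thick 40 20   # LVHD example from slide 8: prints 60
```

The thick case is why a snapshot on an LVHD, iSCSI or FC SR needs free space equal to the full VDI size before it can be taken.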
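9. Automated Coalescing Example
1) VM with two snapshots, C and E
2) When snapshot C is deleted…
3) Parent B is no longer required and will be coalesced into A
[Diagram: three VDI trees. (1) A with children B and C, and B with children D and E; (2) A with child B, and B with children D and E; (3) A + B with children D and E. Key: Snapshot, Clone, Parent, Active]
http://support.citrix.com/article/CTX122978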
10. Snapshot - LVHD based SR example
• Space cost = existing data on disk + VDI size
• Empty VDI size on disk = 8.00M
• Example:
vhd-util scan -m 'V*' -l <SR_VG> -p
vhd-util check -n <LV_PATH>
The LV needs to be available (LV active) for the check to run.
If the LV is not active, the check fails with error 2.
The check gives different output depending on whether the LV is read/write or read only.
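The "LV must be active" requirement can be sketched as a small wrapper. This is an assumption-laden sketch: lvchange and lvs are the standard LVM2 tools, but the helper names (run, lv_is_active, check_vhd) and the example path are hypothetical, and by default the script only prints the commands it would run.

```shell
# Dry run by default: commands are printed, not executed. Set DRY_RUN=0 to run them.
DRY_RUN="${DRY_RUN:-1}"
run() { if [ "$DRY_RUN" = "1" ]; then echo "$*"; else "$@"; fi; }

# The 5th character of the lvs "Attr" column is 'a' when the LV is active.
lv_is_active() { [ "$(printf '%s' "$1" | cut -c5)" = "a" ]; }

# check_vhd LV_PATH LV_ATTR : activate the LV if needed, then check the VHD.
check_vhd() {
  lv_path="$1"; lv_attr="$2"
  if ! lv_is_active "$lv_attr"; then
    run lvchange -ay "$lv_path"
  fi
  run vhd-util check -n "$lv_path"
}

# Real use would read the attribute first, for example:
#   attr=$(lvs --noheadings -o lv_attr "$lv" | tr -d ' ')
check_vhd "/dev/EXAMPLE_VG/EXAMPLE_LV" "-wi---"   # illustrative path and attr string
```

Printing the commands first makes it easy to review what would be activated before touching a production SR.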
11. VHD Repair
Example:
vhd-util repair -n <LV_PATH>
The repair command can fix VHD cookies and other VHD header elements only if the
backup header is present and valid.
Repair cannot fix the VHD content (the data).
12. VHD Chains
• VDI chain growth
• VDI chain hits length limit = 30
ᵒError code: SR_BACKEND_FAILURE_109
Error parameters: , The snapshot chain is too long
• Trigger coalescing
• VHD format introduces overhead on IO
ᵒwrite overhead added by VHD format
• Read overhead of VHD format multiplied by chain length
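To keep an eye on chain growth, the depth of a VDI tree can be estimated from an indented tree listing such as the one printed by the vhd-util scan command on slide 10. This is a rough sketch: the indent width of 3 spaces per level is an assumption to be checked against real output, the helper names are hypothetical, and the sample tree below is illustrative rather than real tool output.

```shell
CHAIN_LIMIT=30   # chain length limit quoted on the slide
INDENT=3         # ASSUMPTION: spaces per tree level; adjust to match real output

# Reads an indented tree on stdin and prints the deepest level (root = 1).
chain_depth() {
  awk -v w="$INDENT" '
    {
      s = $0
      sub(/^ +/, "", s)                            # strip leading spaces
      d = int((length($0) - length(s)) / w) + 1    # indentation -> tree level
      if (d > max) max = d
    }
    END { print max + 0 }'
}

# Succeeds when the given depth has reached the limit.
chain_at_limit() { [ "$1" -ge "$CHAIN_LIMIT" ]; }

# Illustrative tree: A is the root, B and C are children, D is a child of B.
printf 'A\n   B\n      D\n   C\n' | chain_depth   # prints 3
```

A depth that keeps climbing towards the limit is a hint that coalescing is not keeping up and should be investigated before SR_BACKEND_FAILURE_109 appears.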
13. Key tips for handling Snapshots
• For a "golden image":
ᵒCreate a new VM and convert to template
ᵒDo not create it based on a snapshot
• Full vdi-copy is the only way to reduce VHD chain to 0
• NEVER EVER MANUALLY REMOVE ANY VDI(s) FROM A CHAIN!!
• Manually reclaim disk space:
http://support.citrix.com/article/CTX123400
• Don’t use Snapshots as a backup.
ᵒBad use cases – Databases, MS Exchange,
15. VDI not available…
This can happen for different reasons; these are the 2 most common:
• The VDI is actually missing (maybe deleted by mistake)
• The VDI is locked to a host where the VM was running.
16. VDI is actually missing…
In this example we can see that our VM “Linux03” has 2 VDIs assigned:
17. VDI is actually missing…
Here we can see further details about the VDIs:
18. VDI is actually missing…
It might happen that someone (by mistake of course) deletes one of the
VDIs (in this example Test03_0):
Note: Renaming the LV will have the same effect.
19. VDI is actually missing…
When you try to start the VM from XenCenter you get:
But if you try from the CLI you can see further information:
21. VDI is actually missing…
Summary
• In this case the only way to recover would be to restore the missing
VDI.
• It is not uncommon in NFS scenarios for files to be removed; this is very
unlikely in an LVM scenario.
• Starting the VM from the CLI provides a quick way of finding out what
the problem is.
22. The VDI is locked to a host where the VM was running…
To prevent data corruption, XAPI keeps a lock on the VDIs of the VMs in
use. The lock is indicated by a host_OpaqueRef entry under the sm-config
parameter of the VDI.
In this example we are going to use a VM called Linux01 which is
already running:
23. The VDI is locked to a host where the VM was running…
In this example we are going to use a VM called Linux01 which is
already running:
24. The VDI is locked to a host where the VM was running…
When we look at the details of one of the VDIs, the “host_OpaqueRef” setting
is indeed there:
25. The VDI is locked to a host where the VM was running…
If we dig into XAPI we can see that it refers to a host:
In this case “xs-lab02-giovad”
26. The VDI is locked to a host where the VM was running…
When the VM is not running that parameter is cleared from XAPI
27. The VDI is locked to a host where the VM was running…
So what happens if the parameter is stale in XAPI and the VM is no
longer running?
This could happen if the host where the VM was running dies or becomes
unresponsive.
28. The VDI is locked to a host where the VM was running…
When you try to start the VM in XenCenter you will see:
While in the CLI you will see instead:
29. The VDI is locked to a host where the VM was running…
In /var/log/SMlog you will see:
30. The VDI is locked to a host where the VM was running…
Troubleshooting:
• Try to start the VM from the CLI; this will provide further details about the
error, in this case “The VDI is not available… Already attached RW”.
• The error shown on the CLI or in SMlog will also indicate which VDI is the
problematic one.
• Further confirmation can be gathered by checking for “host_OpaqueRef”
in the sm-config parameter of the VDI.
• Make sure the VM is really not running and that the power state is simply wrongly set
in XAPI. Run list_domains on all the hosts of the pool and grep for the uuid of
the VM.
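The two checks above (the stale lock in sm-config and the VM not appearing in list_domains) can be sketched as plain text tests. The helper names are hypothetical and the sample strings are illustrative, not real output; in real use the inputs would come from xe vdi-param-get and from list_domains on each host.

```shell
# Succeeds if the sm-config text still carries a host_OpaqueRef entry.
vdi_is_locked() {
  case "$1" in
    *host_OpaqueRef*) return 0 ;;
    *)                return 1 ;;
  esac
}

# vm_in_domain_list VM_UUID LIST_DOMAINS_OUTPUT
# Succeeds if the VM uuid appears in the given list_domains output.
vm_in_domain_list() { printf '%s\n' "$2" | grep -qF -- "$1"; }

# Real use (not run here), with placeholders left for the uuids:
#   sm=$(xe vdi-param-get uuid=<VDI_UUID> param-name=sm-config)
#   doms=$(list_domains)
sm="host_OpaqueRef:example-ref: RW; vhd-parent: example-parent"   # illustrative
if vdi_is_locked "$sm"; then
  echo "VDI still carries a host_OpaqueRef lock"
fi
```

Only when the lock is present and the VM uuid is absent from list_domains on every host is it reasonable to treat the lock as stale.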
31. The VDI is locked to a host where the VM was running…
How to Recover:
• We need to clear host_OpaqueRef but the parameter is read-only (RO)
• Basically there are 2 options:
1. Manually modify XAPI's database (VERY DANGEROUS and Completely Unsupported)
2. Forget the VDI and add it again; this clears the host_OpaqueRef parameter.
32. The VDI is locked to a host where the VM was running…
How to Recover:
• Once the VDI has been forgotten we do an SR scan to add it back
• It will show up with no name and no description, so we should rename it accordingly
using the properties option in XenCenter
33. The VDI is locked to a host where the VM was running…
How to Recover:
• We then re-attach the disk to the VM
34. The VDI is locked to a host where the VM was running…
How to Recover:
• We then re-attach the disk to the VM
• And if we check for host_OpaqueRef we'll see that it is gone…
35. The VDI is locked to a host where the VM was running…
How to Recover:
• At this stage we can start the VM again
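The recovery steps from slides 31 to 35 can be sketched with the xe CLI. This is a sketch under stated assumptions, not a supported procedure: the function name and its arguments are hypothetical, it assumes the VDI reappears under the same uuid after the SR scan, and it only prints the commands unless DRY_RUN=0 is set.

```shell
# Dry run by default: commands are printed, not executed. Set DRY_RUN=0 to run them.
DRY_RUN="${DRY_RUN:-1}"
run() { if [ "$DRY_RUN" = "1" ]; then echo "$*"; else "$@"; fi; }

# recover_vdi VDI_UUID SR_UUID VM_UUID NAME_LABEL DEVICE
recover_vdi() {
  vdi="$1"; sr="$2"; vm="$3"; label="$4"; device="$5"
  run xe vdi-forget uuid="$vdi"                          # drop the record holding the stale lock
  run xe sr-scan uuid="$sr"                              # re-introduce the VDI from the SR
  run xe vdi-param-set uuid="$vdi" name-label="$label"   # it comes back with no name
  run xe vbd-create vm-uuid="$vm" vdi-uuid="$vdi" device="$device" mode=RW type=Disk
  run xe vm-start uuid="$vm"
}

# Placeholders only; substitute the real uuids before running with DRY_RUN=0.
recover_vdi "<VDI_UUID>" "<SR_UUID>" "<VM_UUID>" "Linux01-disk0" 0
```

Doing this for one VDI at a time, and re-attaching disks in their original device order, follows the tips on the next slide.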
36. The VDI is locked to a host where the VM was running…
Tips:
• If you have many VDIs of the same size, it will be hard to know which
one is the correct one if you forget many at once, so it is better to do one at a time.
• The order in which you attach the VDIs is important, so this also needs to be
done in order.
• In XenServer 6.1 we provide a script that does this automatically for all the
VMs on the missing host as part of the procedure to recover from missing
members of a pool, see:
http://docs.vmd.citrix.com/XenServer/6.1.0/1.0/en_gb/reference.html#pool_failures
38. Hotfixes and driver installation
• XenServer uses industry-standard open source device drivers.
• It is not possible for Citrix to test every piece of hardware available, so we rely
on the vendors for testing and certification.
• Drivers are updated on a regular basis to fix bugs or improve
performance; however, a driver must always match the firmware of the device.
Installing a newer driver on very old firmware will cause more problems than
it solves, and the same happens the other way around.
39. Hotfixes and driver installation
My device doesn't work anymore after applying a hotfix!!
• Some hotfixes provide a new kernel, so we need to install the driver that matches
the new kernel
• For example, Hotfix XS602E021 for XenServer 6.0.2 provides a new kernel:
40. Hotfixes and driver installation
My device doesn't work anymore after applying a hotfix!!
• Therefore we need to install the drivers compiled for that kernel version:
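Whether a driver was built for the running kernel can be checked by comparing the kernel release with the module's vermagic string, whose first word is the kernel release the module was compiled for. A minimal sketch follows; the function name is hypothetical and the kernel strings in the example are invented placeholders, not real XenServer versions.

```shell
# driver_matches_kernel KERNEL_RELEASE VERMAGIC
# Succeeds when the first word of the vermagic string equals the kernel release.
driver_matches_kernel() {
  built_for="${2%% *}"          # keep everything before the first space
  [ "$built_for" = "$1" ]
}

# Real use (not run here):
#   driver_matches_kernel "$(uname -r)" "$(modinfo -F vermagic bnx2)"
if driver_matches_kernel "2.6.32-new-example" "2.6.32-old-example SMP mod_unload"; then
  echo "driver matches the running kernel"
else
  echo "driver was built for a different kernel"
fi
```

A mismatch right after a hotfix is the typical sign that the driver package for the new kernel still has to be installed.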
41. Hotfixes and driver installation
How do I know which driver/firmware I am running?
• In the case of network interfaces “ethtool -i” will help:
# ethtool -i eth0
driver: bnx2
version: 2.2.1j
firmware-version: 7.2.20 bc 5.2.3 NCSI 2.0.11
bus-info: 0000:01:00.0
• If you know the name of the driver you could also use modinfo, for example “modinfo bnx2”.
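When many hosts need to be compared, the ethtool -i output shown on this slide can be picked apart field by field. A small sketch; the helper name ethtool_field is made up, and the sample text is the output from the slide.

```shell
# ethtool_field FIELD : reads `ethtool -i` output on stdin and prints one field.
ethtool_field() {
  awk -F': ' -v key="$1" '$1 == key { print $2 }'
}

# Sample taken from the slide; real use would pipe `ethtool -i eth0` instead.
sample='driver: bnx2
version: 2.2.1j
firmware-version: 7.2.20 bc 5.2.3 NCSI 2.0.11
bus-info: 0000:01:00.0'

printf '%s\n' "$sample" | ethtool_field driver             # prints bnx2
printf '%s\n' "$sample" | ethtool_field firmware-version   # prints 7.2.20 bc 5.2.3 NCSI 2.0.11
```

Recording the driver and firmware pair before and after a hotfix makes it easier to spot a driver and firmware mismatch later.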
42. Hotfixes and driver installation
How do I know which driver/firmware I am running?
• For HBA drivers you will need to check the boot-time messages in kern.log, for example:
• Or use modinfo if you know the driver involved:
43. Hotfixes and driver installation
How do I know which driver/firmware I am running?
• For HBA firmware you will need to check the HBA BIOS during boot
time; the way to access it changes from vendor to vendor, so check with them
if in doubt.
• QLogic HBAs load the firmware when they load the driver, so unless the HBA is running
very old firmware (earlier than version 4.x) there's no need to verify the
version.
45. Why XenServer Tools (aka XenTools)?
• Optimize your VMs !!!
• Provides high performance drivers and a management agent
• Always install them after creating a new VM so that it is optimized
46. Legacy support for drivers (XenServer 6.1)
• For XP & Windows Server 2003:
47. To make things simpler ….
• For All Versions of Windows:
• Install Wizard provided
• .Net 3.5 or above is required
• 2 parts/phases to install
• MetaInstaller needs to be installed first
• Drivers are then installed
48. Troubleshooting …
• If the install wizard goes wrong, run it again with the argument
• /log logfile.txt
• If it still goes wrong, look in
• C:\programdata\citrix\InstallWizard*
• Contains the driver and agent MSI install logs
• Install.txt has other (verbose) install messages
• The old installer has install.log in c:\program files\xentools as before
• Watch the antivirus
49. Issues with XenTools (XenServer 6.1)
• VMs with out of date XenServer tools may not be shown in XenCenter
• Slow boots for a PVS target device using a Boot Device Manager
• PVS licensing issues
• Intermittent blue screen errors when shutting down MS Vista VMs
• Adding more than eight NICs to MS Vista causing blue screens
• Copying data to a MS Windows 2003 VM causes the VMs to hang / grey
screen
• When Dynamic Memory Control (DMC) is enabled, using XenMotion
causes VMs to hang and blue screen.
50. Issues with XenTools (XenServer 6.1)
• Cut and paste issues between a XenDesktop VDA and an endpoint
(when the Citrix Xen Guest Agent is running)
• Windows XP and later VMs intermittently hanging during the boot process
• Installation hangs when installing tools on Vista / later VMs without access
to PV / emulated network device
• Manually installing legacy tools without changing a device id to 0001 can
result in a blue screen
• VSS (required for third party backup solutions) is now available for use with
XenServer 6.1
51. Resolving issues with XenTools
• Apply the following Hotfixes:
XS61E009
XS61E010
53. Future Windows Update Support
• Driver model restructuring allows independent upgrades
• Deployment of bug fixes and new features out of band
• No need for ISO injection
• Updates can be installed automatically at the domain administrator's
discretion
• WSUS deployments not yet available but will be made available
54. About
Citrix Services
Citrix Services make sure
you succeed with your
virtualization programs.
How we can help
Citrix Education – The fastest, most efficient way to
get your team the virtualization skills they need. Online,
on-site or in class.
citrix.com/training
Citrix Consulting – Intensive engagements for
complex, critical or just plain massive projects.
citrix.com/consulting
Citrix Support – Always-on support services that
leverage everything we know about best-practice
deployment and maintenance.
citrix.com/support
Educate | Guide | Support | Succeed
55. Secrets of the Citrix Support Ninjas
• 40 insider troubleshooting tips
• Covering XenDesktop, XenServer, XenApp and NetScaler
• Citrix Support top engineers
• FREE eBook
• Citrix Auto Support
• Now available!
57. Next Webinar: July
• Title: Troubleshooting XenApp with the Citrix Diagnostic Toolkit
• Description: When problems occur, support engineers need data
points, debug tracing and context information to help determine root causes.
Preparation and organization of commonly used tools has always been a time-
consuming challenge, especially during outages. The Citrix diagnostics toolkit
(CDT) addresses these challenges by rapidly deploying a suite of tools and
options in an easy-to-use structured format.
• When: July 25th
• Registration Now!
When troubleshooting XenServer issues, it is important to understand the XenServer architecture. Although you can use XenCenter to work with XenServer hosts, you can also use SSH to connect to Dom0, the VM with special privileges that must be running on each XenServer host. There are a variety of ways to talk to XenServer (various consoles, cloud-management software, PowerShell etc.); in the background we are usually talking directly to the XAPI toolstack. On the Dom0 console, we can use xe commands to call the Xen API. We normally use standard Linux commands and tools to talk directly to the Xen hypervisor and to the hardware. When XAPI commands are issued, standard Linux configuration files in Dom0 are involved, and device drivers are also called to talk directly to the storage.
Generally speaking, whether you are talking about a storage area network or local storage (i.e. the local disk on the XenServer host itself), there are 2 types of storage: thin provisioned and thick provisioned.
A good analogy for thin provisioned storage is a file divider: it may start with no files in it, but as you add more and more files it starts to expand. It is exactly the same in the virtual world. The danger is that you can over-allocate your storage.
Thick provisioning is more analogous to box files (as illustrated in the diagram to the far right). A box file will always take up a certain amount of space regardless of whether there is 1 file in there or 100 files. In the virtual world it is very similar when creating a 40GB disk: there will always be 40GB reserved for the contents, even if you add files that take up only a fraction of the entire size of the disk. Depending on the type of storage you use, it will be thin or thick.
If you want to create a point-in-time backup of a virtual machine, a snapshot should be created on the xe command line or within XenCenter. Our snapshot technology enables backup vendors to interface directly with XenServer, allowing in-guest full and incremental backups of VMs. Snapshots are available on all storage platforms through the LVHD implementation. Creating and deleting snapshots differs quite substantially depending on whether you are using thin or thick provisioned storage.
Assume you created a virtual disk (along with your VM) of 40GB and used 20GB of storage space for the operating system and various files (represented by 'A' in the above diagram). Further assume that thin provisioning is in use. Once we take a snapshot, the following happens:
• Disk A is deflated to 20GB and becomes read-only.
• A new disk for writing data is created (initially close to 0 bytes in size as it is thinly provisioned). It only holds VHD header information; let's call it 0 bytes for simplicity.
• A snapshot VHD is also created, which is also close to 0 bytes in size.
The total size of the disk after taking the snapshot is c. 20+0+0 = c. 20GB.
Assume you created a virtual disk (along with your VM) of 40GB and used 20GB of storage space for the operating system and various files (represented by 'A' in the above diagram). Now assume that thick provisioning is in use (i.e. you are using Fibre Channel, iSCSI or local LVHD). Once we take a snapshot, the following happens:
• Disk A is deflated to 20GB and becomes read-only.
• A new disk for writing data is created and inflated to 40GB.
• A snapshot VHD is also created, which is close to 0 bytes in size, i.e. it just contains VHD header information.
The total size of the disk after taking the snapshot is c. 20+40+0 = c. 60GB.
If you delete snapshots (e.g. if you delete the VHDs labelled 'C' and 'E' in the diagram to the top left), we should be able to coalesce all of the information from disks B and D back into A if writes have been made to both. Note that the longer the chain, the slower this process is going to be.
The command vhd-util scan -m 'V*' -l <SR_VG> -p will show a parent VDI on the left and the child VDIs underneath.
The vhd-util repair command can fix VHD header information because VHD files contain a second copy of the header as a backup, so there is a good possibility that corrupted header information can be repaired.
The amount of disk space taken by snapshots depends on the size of the VDI and the amount of shared data.
The key components of XenTools are:
• XEN: initializes and owns the hypercall interface
• XENFILT: unplugs emulated devices
• XENBUS: creates the main PV interfaces (e.g. EVTCHN, XENSTORE etc.) and enumerates all PV classes (e.g. VIF, VBD etc.)
• XENVIF: network class driver; enumerates VIFs
• XENNET: network interface driver / NDIS 6.0 miniport
• XENVBD: storage class driver (enumerates VBDs) / Storport miniport (the SCSI filter driver is no longer needed and the Storport miniport driver is much faster)
• XENIFACE: WMI API to XENSTORE
For the XenServer 6.1 product the XenTools package has changed significantly. To support Windows XP and Windows Server 2003, the legacy drivers and install package continue to be supported and maintained (i.e. they receive bug fixes). For Windows XP and Windows Server 2003, the installer app is now called xenlegacy. The familiar xensetup.exe meta-installer is still available. Due to the new WDK (Windows Driver Kit) for Windows 8, the driver set had to be split between the newer drivers (Windows Vista and higher) and the older ones for Windows XP and Windows Server 2003.
In order to make things simpler, a single install wizard is used.
Issues Resolved In Hotfix XS61E009 and XS61E010
1. Virtual Machines (VMs) with out of date XenServer Tools may not be flagged as "out of date" in XenCenter. Hotfix XS61E009 resolves this issue and enables customers to be notified in XenCenter when new XenServer Tools are available.
2. Booting a Citrix Provisioning Services (PVS) target device using a Boot Device Manager (BDM) image can take an extended time to complete. Hotfix XS61E009 resolves this issue.
3. Customers using XenServer Platinum Edition to license Citrix Provisioning Services (PVS) may find that one PVS license per VM is checked out, rather than one PVS license per XenServer host. This may lead to a shortage of PVS licenses and an inability to provision VMs. Hotfix XS61E010 along with CTX135672 - Hotfix CPVS61016 (Version 6.1.16) - For Citrix Provisioning Services 6.1 - English resolves this issue.
4. Attempts to shut down Microsoft Windows Vista and later VMs can cause intermittent blue screen errors, with a "STOP: 0x0000009f..." error message. Hotfix XS61E010 resolves this issue.
5. Adding more than eight NICs to Microsoft Windows Vista and later VMs using the xe CLI can lead to a blue screen error on reboot. Hotfix XS61E010 resolves this issue.
6. Copying data to a Microsoft Windows 2003 VM can cause the VMs to hang and lead to a grey screen error. Hotfix XS61E010 resolves this issue.
7. When Dynamic Memory Control (DMC) is enabled, attempts to migrate Microsoft Windows XP and later VMs using XenMotion can cause the VMs to hang and lead to a blue screen error. Hotfix XS61E010 resolves this issue.
8. When the Citrix Xen Guest Agent service is running, Cut and Paste will not work between a XenDesktop virtual desktop and the endpoint device. Hotfix XS61E010 resolves this issue.
Each of the issues below has also been resolved in Hotfix XS61E009 and XS61E010
1. When the Citrix Xen Guest Agent service is running, Cut and Paste will not work between a XenDesktop virtual desktop and the endpoint device. Hotfix XS61E010 resolves this issue.
2. Microsoft Windows XP and later VMs may hang during the boot process and may have to be forced to reboot. Hotfix XS61E010 resolves this issue.
3. Attempting to install or upgrade the XenServer Tools on Microsoft Windows Vista and later VMs which do not have access to a paravirtualized or an emulated network device can cause the installation process to hang. Hotfix XS61E010 resolves this issue.
4. Manually installing the Legacy XenServer Tools without changing the device_id to 0001 can result in a "STOP: 0x0000007B..." error when rebooting a Windows VM. After installing Hotfix XS61E010, customers will not be able to manually install the Legacy XenServer Tools by running xenlegacy.exe. When customers start the XenServer Tools installation process, the installwizard.msi launches automatically.
5. Microsoft Volume Shadow Copy Services (VSS) (required for third party backup solutions) was unavailable on Microsoft Windows Server 2008 in the original version of XenServer 6.1.0. After installing Hotfix XS61E010, XenServer 6.1.0 customers will be able to take quiesced snapshots on Microsoft Windows Server 2003 and Windows Server 2008 VMs. For Windows Server 2008 R2, see the guidance that appears earlier in this section.
Some workarounds have been followed in relation to known issues (as discussed). It is very important that the guidance provided in http://support.citrix.com/article/CTX135099 is followed in these cases. This article gives a comprehensive overview of how to troubleshoot XenServer Tools in general, and particularly in relation to XenServer 6.1.
A key advantage of the Windows Update model is that bug fixes and new features can now be delivered automatically and independently of any XenServer release. Windows administrators and XenServer administrators can work independently, which is better suited to cloud-based installations.
At Citrix Services we’re Citrix consultants, teachers and support engineers, and we’re all about one thing: making sure you succeed. With our help, you’ll deploy high-performance, robust virtualization and networking projects faster, with dramatically lower risk and higher return.
The best Citrix architects and administrators are the ones who never stop learning, and Citrix Education is here to help you learn those skills.
Citrix Consulting gives you direct access to our most experienced virtualization and networking experts. When it’s complex, when it’s mission-critical, when it’s big: that’s when Citrix consultants can really help.
On your virtualization journey, you’ll want always-on support from people who really care about your success. There’s no better insurance for your Citrix investment than Citrix Support.
Secrets of the Citrix Support Ninjas is a FREE eBook available next week. The eBook contains 40 insider troubleshooting tips for administrators, so its purpose is to help administrators like you keep your Citrix deployments on track. We’ve collected some of our support engineers' best tips and tricks for running robust Citrix environments and packaged them up into a free eBook. In it, you’ll discover some of the little-known tricks that our own support people use every day to tune, tweak, troubleshoot and test Citrix solutions. You may know a few of these tips. But you probably don’t know them all. And, you never know, you might discover just one that will change your life as an administrator. Let me give you a sneak peek now.
"Sometimes setting up a NetScaler may seem impossible. What do you do when you hit a roadblock at an early stage? In this session, Ronan will step through the most common issues you can experience at an early stage of your NetScaler deployment, and how to diagnose them using on-box tools. During this session you will learn:
- Troubleshooting basic setup issues
- Health Checks
- Introducing redundancy and removing single points of failure
- Logging: what happened historically"
Please note that some images used in this presentation are c. Presenters Media, 2013 (All rights reserved)