Sunday, December 2, 2012

What can you do when Citrix XenServer 6.1 High Availability (HA) malfunctions?

When I talk with Citrix XenServer customers, they tell me that Citrix XenServer 6.1 is quite stable, with one exception: when the High Availability (HA) feature is enabled.

What is XenServer HA?

Citrix XenServer High Availability (HA) is a feature that restarts your virtual machines on healthy XenServer hosts within a pool when one or more XenServer hosts suffer a physical failure.

Sounds good? However, this feature relies on a datastore heartbeat (i.e. it uses a SAN LUN or NFS share to check whether hosts are still alive), so a nightmare begins if this LUN fails. The consequence is that you can no longer control any of the XenServer hosts in the pool.

You will see something like the following when you try to connect with XenCenter:


When you try to view the server console, you will see something like this:

All networking settings are gone.

If you try to issue an xe command, you will get something like the following:

This shows that XenServer becomes uncontrollable if the HA heartbeat storage fails.

To recover from this problem, disable the HA feature on the XenServer pool master with "xe host-emergency-ha-disable --force".
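The recovery step above can be wrapped in a small script. This is only a minimal sketch, not an official procedure: the run helper, the DRY_RUN variable and the disable_ha_emergency function are names I made up for illustration, and the follow-up "xe host-list" call is just my assumption of a reasonable way to check that xe answers again. By default it only prints the commands instead of executing them.

```shell
#!/bin/sh
# Sketch: emergency-disable HA on the local host (run on the pool master).
# DRY_RUN, run and disable_ha_emergency are hypothetical names, not part of xe.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode; execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

disable_ha_emergency() {
    # Force HA off locally, even though the heartbeat storage is unreachable.
    run xe host-emergency-ha-disable --force || return 1
    # Assumed sanity check: list the hosts to see whether xe responds again.
    run xe host-list params=name-label,enabled
}
```

Call disable_ha_emergency to preview the commands, then run the script with DRY_RUN=0 on the pool master console once you are happy with what it prints.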


At the server console, you will see the networking settings return.

This means the XenServer pool master has started up normally with HA disabled.
When you try XenCenter again, you can now connect.

All you have to do now is recover your LUN or create a new LUN for the datastore heartbeat. Then enable HA again with the following command:

"xe pool-ha-enable heartbeat-sr-uuids=<heartbeat SR UUID>"
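If you prefer to script the re-enable step, here is a minimal sketch. The enable_ha function, the run helper and the DRY_RUN variable are hypothetical names of my own, and the "xe pool-list params=ha-enabled" check at the end is simply one way I would expect to confirm the result from the command line. Note that the xe parameter is spelled heartbeat-sr-uuids. As written, the script prints the commands rather than running them.

```shell
#!/bin/sh
# Sketch: re-enable HA with a given heartbeat SR.
# DRY_RUN, run and enable_ha are hypothetical names, not part of xe.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode; execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

enable_ha() {
    sr_uuid="$1"
    if [ -z "$sr_uuid" ]; then
        echo "usage: enable_ha <heartbeat-sr-uuid>" >&2
        return 1
    fi
    # Enable HA on the pool, using the given SR for the heartbeat.
    run xe pool-ha-enable heartbeat-sr-uuids="$sr_uuid" || return 1
    # Show the pool's ha-enabled field to confirm the change.
    run xe pool-list params=ha-enabled
}
```

You can look up the UUID of the heartbeat SR with "xe sr-list params=uuid,name-label" and pass it to enable_ha. Set DRY_RUN=0 to actually execute the commands.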

You can confirm that HA is enabled in XenCenter:


Since the datastore heartbeat LUN is critical for HA to function, and XenServer, unlike VMware vSphere, cannot use two heartbeat LUNs, you are recommended to dedicate a LUN (1 to 2 GB is enough) to the datastore heartbeat.

I hope this article saves your life in the datacenter.


