vSAN Datastore: Storage Fault has occurred

I recently faced an issue where my vSAN datastore was “incompatible” with each of my Storage Policies (All-Flash), even though the VM objects were distributed across the hosts as expected. The reason given for the incompatibility was “Datastore does not match current VM policy. Storage Fault has occurred.” However, vSAN health was all green, and no physical disk or component had failed. Let’s see what was wrong and how you can get rid of the issue.

vSAN Datastore not compatible

First of all, some context: we are talking about an All-Flash stretched cluster, which requires a Witness ESXi host (nested or physical). This Witness host must be deployed in a third-party cluster, apart from the vSAN cluster, and in a third physical site (separate from the fault domains). The requirements are nothing special; more information can be found here.

Now, when you deploy the Witness host OVA, the wizard asks you to select the VM placement, the compute resource, the infrastructure size, the target storage and the network. Once the Witness OVA is deployed, you just have to set it in the Fault Domain configuration and you’re ready to go with your Stretched Cluster.

But here’s the thing: who checked that the Witness VM was deployed on All-Flash storage? After all, the Witness host is a member of the cluster. When an inter-site link failure occurs, vSAN forms a cluster with the Preferred site and the Witness, so the Witness has to have the same configuration, or at least the same drive type (hybrid/all-flash). Here is the default configuration when the Witness is deployed on non-flash storage:

Witness Disks

The cache device is marked as Flash, but the capacity devices are marked as HDD, which makes our Witness disk group Hybrid:

Hybrid

This is what makes it tricky: it is easy to forget that the third-party cluster (outside the vSAN/VxRail cluster) does not necessarily have flash devices. Indeed, by default only the Witness cache device is marked as Flash, no matter which kind of underlying storage is in use.
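To make the hybrid versus all-flash distinction concrete, here is a minimal Python sketch that classifies a disk group from text shaped like the output of `esxcli vsan storage list` on the Witness host. The sample text is illustrative, not captured from a real host, and it assumes the output exposes `Is SSD` and `Is Capacity Tier` fields per device; check the field names against your ESXi build before relying on it.

```python
# Illustrative sample only: two devices, trimmed to the fields used below.
SAMPLE = """\
mpx.vmhba1:C0:T2:L0
   Device: mpx.vmhba1:C0:T2:L0
   Is SSD: true
   Is Capacity Tier: false
mpx.vmhba1:C0:T1:L0
   Device: mpx.vmhba1:C0:T1:L0
   Is SSD: false
   Is Capacity Tier: true
"""


def parse_vsan_storage_list(text):
    """Split the listing into one dict of fields per device."""
    devices = []
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue
        if not line[0].isspace():
            # An unindented line starts a new device block.
            current = {"Name": line.strip()}
            devices.append(current)
        elif current is not None and ":" in line:
            # Split on the first colon only; device IDs contain colons too.
            key, _, value = line.strip().partition(":")
            current[key.strip()] = value.strip()
    return devices


def disk_group_type(devices):
    """Return 'all-flash', 'hybrid', or 'unknown' based on the capacity tier."""
    capacity = [d for d in devices if d.get("Is Capacity Tier") == "true"]
    if not capacity:
        return "unknown"
    if all(d.get("Is SSD") == "true" for d in capacity):
        return "all-flash"
    return "hybrid"


print(disk_group_type(parse_vsan_storage_list(SAMPLE)))  # prints "hybrid"
```

With the sample above, the cache device is flash but the single capacity device is not, so the disk group is reported as hybrid, which is the default Witness situation described here.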

Here are the steps required to fix this and make your datastore “Compatible” with your Storage Policies again:

  • Put the Witness host in Maintenance Mode
  • Unmount the Witness Disk Group – Cluster > Configure > vSAN > Disk Management:
Unmount
  • Mark all the devices as Flash – ESXi host > Configure > Storage Devices:
Set Flash
  • Go back to Cluster > Configure > vSAN > Disk Management and mount the Disk Group:
Mount
  • Then select Recreate from the menu:
Recreate
  • When the recreation is done, you should see your Witness Disk Group displayed as All-Flash:
Healthy
  • Take the host out of Maintenance Mode and check Skyline Health. Some objects may have to resync; you can do this immediately or schedule it, depending on the load in your environment.
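For the “Mark all the devices as Flash” step, a command-line alternative to the vSphere Client is to tag each device with the `enable_ssd` SATP claim rule option and then reclaim it. The sketch below only builds the `esxcli` command strings so you can review them before running anything on the Witness host. It is not what I used above, the device IDs are hypothetical, and the SATP name `VMW_SATP_LOCAL` is an assumption; confirm the actual SATP of each device with `esxcli storage nmp device list` first.

```python
def mark_flash_commands(device_ids, satp="VMW_SATP_LOCAL"):
    """Build the esxcli commands that tag each device as flash.

    Nothing is executed here; the caller reviews and runs the commands.
    """
    commands = []
    for dev in device_ids:
        # Add a claim rule that tags the device as SSD/flash.
        commands.append(
            f"esxcli storage nmp satp rule add --satp={satp} "
            f"--device={dev} --option=enable_ssd"
        )
        # Reclaim the device so the new rule takes effect.
        commands.append(f"esxcli storage core claiming reclaim --device={dev}")
    return commands


# Hypothetical capacity device ID, for illustration only.
for cmd in mark_flash_commands(["mpx.vmhba1:C0:T1:L0"]):
    print(cmd)
```

As with the UI procedure, the disk group should be unmounted before the devices are re-tagged.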

Now, going back to your Storage Policy management, you can verify that the vSAN datastore shows as “Compatible”:

vSAN Datastore Compatible
