
AD dependency of Hyper-V 2016 Cluster


Hi!

We currently run multiple Hyper-V 2016 clusters in our company, all with SAN LUNs attached.

This morning we appear to have had a technical issue at the storage level, which still has to be investigated.

From 6:37:29 until 6:37:35, every LUN on every cluster and node reported the following event (5120):

Cluster Shared Volume 'LUNNAME' ('LUNNAME') has entered a paused state because of 'STATUS_CONNECTION_DISCONNECTED(c000020c)'. All I/O will temporarily be queued until a path to the volume is reestablished.

After the connection to every LUN was lost, we received the following critical event (1135), again from every node:

Cluster node '99-001-203-s052' was removed from the active failover cluster membership. The Cluster service on this node may have stopped. This could also be due to the node having lost communication with other active nodes in the failover cluster. Run the Validate a Configuration wizard to check your network configuration. If the condition persists, check for hardware or software errors related to the network adapters on this node. Also check for failures in any other network components to which the node is connected such as hubs, switches, or bridges.

Finally, and logically, this was followed by event 1177:

The Cluster service is shutting down because quorum was lost. This could be due to the loss of network connectivity between some or all nodes in the cluster, or a failover of the witness disk. Run the Validate a Configuration wizard to check your network configuration. If the condition persists, check for hardware or software errors related to the network adapter. Also check for failures in any other network components to which the node is connected such as hubs, switches, or bridges.

However, since no network-related events were found, I do not understand why the cluster nodes went down after losing their storage. The only explanation I can think of for this behavior is a loss of authentication towards the domain controllers.

We have 2 physical and 2 virtual DCs, so this would mean that at that moment all of the Hyper-V hosts (32 hosts spread over 5 clusters) were authenticated to the virtual DCs and did not manage to re-authenticate to a physical one in a timely manner.

Moreover, it was my understanding that the combined use of the Network Service and CLIUSR accounts drastically reduced (negated?) the need for DCs to remain available in an emergency.

Can anybody shed some light on this situation? 
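
For anyone who wants to check the ordering described above (5120, then 1135, then 1177) across nodes, here is a minimal Python sketch that merges event records into one timeline and reports when each event ID first appears. It assumes the events have already been exported from each node into (timestamp, event ID, node) tuples. The function names, node names, the date, and the 1135 and 1177 timestamps are all placeholders for illustration; only the 5120 window of 6:37:29 to 6:37:35 comes from our logs.

```python
from datetime import datetime


def build_timeline(records, start, end):
    """Return the records with start <= timestamp <= end, sorted by timestamp."""
    in_window = (r for r in records if start <= r[0] <= end)
    return sorted(in_window, key=lambda r: r[0])


def first_seen(timeline):
    """Map each event ID to the earliest timestamp at which it appears."""
    seen = {}
    for timestamp, event_id, _node in timeline:
        # The timeline is sorted, so the first hit per ID is the earliest.
        seen.setdefault(event_id, timestamp)
    return seen


# Placeholder data: the date and the node names are hypothetical, and the
# 1135/1177 times are invented to illustrate the ordering check.
day = datetime(2000, 1, 1)
records = [
    (day.replace(hour=6, minute=37, second=40), 1135, "node-a"),
    (day.replace(hour=6, minute=37, second=29), 5120, "node-a"),
    (day.replace(hour=6, minute=37, second=45), 1177, "node-a"),
    (day.replace(hour=6, minute=37, second=35), 5120, "node-b"),
    (day.replace(hour=9, minute=0, second=0), 1135, "node-b"),  # outside the window
]

timeline = build_timeline(
    records,
    start=day.replace(hour=6, minute=30),
    end=day.replace(hour=6, minute=45),
)
earliest = first_seen(timeline)
order = sorted(earliest, key=earliest.get)  # event IDs by first appearance
print(order)
```

Adding network-related event IDs to the same record list would show whether any of them fall inside the window, which is the part I could not confirm.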




