Channel: Hyper-V forum

CSVs fail when nodes rejoin the cluster


Going to jump right in because I'm just completely stuck with what's going on and Microsoft has been zero help under our SA contract.

I have an eight (8) node Server 2016 Hyper-V cluster.  The nodes are split between two Dell chassis, if that matters.  Each blade has a dedicated host management/private network, an iSCSI network, and a Fibre Channel connection, using individual switches in the chassis.  The CSVs consist of six (6) iSCSI all-flash backed LUNs and six (6) FC spinning-disk backed LUNs, with one iSCSI flash LUN for a quorum disk.  Most of the VMs are held on the iSCSI disks for performance reasons, with data VHDs held on the FC LUNs.

When a node is properly paused and drained for a reboot (patching, maintenance, etc.) and then rejoins the cluster after rebooting, all of the cluster disks in the GUI start to flash: the stats, such as the size of the disk, disappear and then quickly reappear.  In the event log on the rebooted node, I start to see error messages like "'STATUS_BAD_NETWORK_PATH(c00000be)'. All I/O will temporarily be queued until a path to the volume is reestablished."  This event is logged for seemingly random CSVs: it could be all of them or only a subset, and it is never consistent.  Mind you, the node is still paused at this point, has all of its networking capabilities, and is properly communicating with the other cluster nodes.

At this point, the other nodes begin to throw errors on the same disks the rebooted node was complaining about, especially the nodes that own those disks.  Then, most of the time, the CSV fails and runs through its recovery options to come back up on another node, causing several VMs to crash in the process.

One last oddity is that even though the rebooted node complains about a BAD_NETWORK_PATH for some of the FC disks, none of the FC data disks have ever failed, only the iSCSI all-flash LUNs.  This is more detrimental to me, though, because the majority of OS disks are held on these LUNs.

Any help is greatly appreciated, thanks!

