So, maybe third time's the charm posting on the forums (hopefully the best forum for the question this time):
I'm copy/pasting this from the regular MS forums as suggested by a moderator. I view this as more of a host OS issue than a Hyper-V issue personally, but defer to the experts.
So, it's taken me a while to track down exactly which process is causing this. The screenshot I've attached is the first evidence I've found.
Since my account on this side isn't verified, the screenshot can be viewed here:
https://answers.microsoft.com/en-us/windows/forum/windows_10-performance/w10-1903-in-vm-hanging-with-system-process-using/0c30370e-823a-41f1-bdfc-37c685ee001b?tm=1572380956433
This is a virtual Windows 10 Pro 1903 installation on a Windows Server 2016 Datacenter Hyper-V host.
The host has a single Xeon E3-1275 v6 quad-core processor and 64GB RAM. The host OS is installed on an SSD, while the VMs and their storage are kept on a pair of traditional HDDs in RAID1. There are a total of 8 VMs running on this host. One is the Linux-based virtual appliance for our on-premises Bitdefender security product; six are additional Windows Server VMs for file servers, a print server, WSUS, a redundant DC, and some internal web-based applications. The last is this problem child Windows 10 VM, and it is the only one experiencing issues. All of the Windows-based VMs are Generation 2 and each has a single vCPU. Memory has been assigned to each VM based on demonstrated need, at least initially.
Around 3 months ago (not sure exactly, but certainly not long after we approved the update to 1903), this VM stopped checking in to WSUS. Whenever I looked into it, Hyper-V Manager reported the VM using about 12% of the host processor, which translates into 100% inside the virtual environment. Requests to remote in would time out, and connecting through Hyper-V Manager would show the lock screen, which was marginally interactive but unable to proceed through logon. The only way to get it to respond is to reset the virtual machine, after which it starts up normally. However, it can freeze again in as little as 15 minutes, or it can take over a day, without any discernible pattern.
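A rough sketch of the arithmetic behind that 12% figure, in Python. It assumes the host exposes 8 logical processors (the quad-core Xeon with Hyper-Threading enabled, which is an assumption, not something stated above) and that Hyper-V Manager reports guest CPU usage as a share of total host capacity. The function name is purely illustrative.

```python
def host_cpu_share(busy_vcpus: float, host_logical_processors: int) -> float:
    """Percent of total host CPU consumed by the given number of fully busy vCPUs."""
    if host_logical_processors <= 0:
        raise ValueError("host_logical_processors must be positive")
    return 100.0 * busy_vcpus / host_logical_processors


# One pegged vCPU on an assumed 8-logical-processor host.
share = host_cpu_share(busy_vcpus=1, host_logical_processors=8)
print(f"{share:.1f}%")  # 12.5%
```

If Hyper-Threading were disabled (4 logical processors), the same calculation would give 25%, so a reading of about 12% is consistent with one vCPU pegged on an 8-thread host.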
This system hosts the only Java installation we have on campus, which is needed to access the HVAC controls for our campus and to monitor some of our wireless access points. Other than the browser and Java, there is no additional software (except for what I've downloaded for troubleshooting since this issue began). The system had been working fine for over a year with only regular updates being done. As it's a VM, there's no physical hardware for which drivers could be out of date. The VM was initially assigned 2GB of memory; that has since been incrementally increased to 8GB, which has done nothing to mitigate the issue.
Since the time between system startup and hang appears to be somewhat random, it's not usually possible to catch the moment it hangs. An individual in another forum asked about licensing, and, while we are licensed for this scenario, it would seem to be a strange method by which to enforce licensing in the first place. In the case of the attached screenshot, it froze minutes after I left my desk for the day, but before the problem VM's lock screen came on. I had left screen sharing software running in the hopes of catching it in the act, and finally succeeded.
Any suggestions for this one? Certainly it's possible to just reinstall from scratch, but if a solution exists, I'd rather find that than just keep nuking and paving the VM every time it decides to try to turn into a space heater.
Best wishes,
Roy