I'm not sure how much anyone will be able to help, but I've run into a very annoying issue that I've narrowed down to PCI passthrough. I've set up a Windows 10 VM with an Nvidia GTX 980 passed through successfully; all of the drivers are working, and I access the system using Parsec. At times the VM becomes unreachable, and I receive notifications through CheckMK from the other 2 hosts in the cluster that the host the VM is running on is no longer online.
After some testing, I've found that the host is stable while the VM is idling. However, when I put a load on the GPU, such as running a game or encoding media, the host locks up anywhere between 5 minutes and 5 hours after the load starts. Checking the console, I'm getting errors saying "NMI watchdog detected hard LOCKUP on cpu8" and "watchdog: BUG: soft lockup - CPU#16 stuck for 100s! [kworker/u40:0:870237]".
I've also attached an image of the console during a lockup showing these errors. I'm a little unsure what I can do to solve this, so I'm hoping someone can help.