Hi, I'm currently battling with NVIDIA GPU drivers. The issue: even though I was able to pass the GPU through successfully, no NVIDIA driver will work. I'm at a dead end right now.
Issue: the VM with the GPU won't load the drivers during kernel boot:
Nov 20 23:13:00 gputest kernel: nvidia: loading...
I've undertaken a project to build a Proxmox server for machine learning and remote gaming, but I'm stuck getting the NVIDIA drivers to work on my Ubuntu 22.04 VM. The plan is to have a few VMs, each configured for either machine learning or gaming, using different GPUs.
We have run into the following problem multiple times now: the PCIe GPU gets stuck and becomes unavailable on the host and, of course, in any virtual machine.
Setup: a Proxmox host with an NVIDIA Tesla T4:
# lspci | grep NVIDIA
37:00.0 3D controller: NVIDIA Corporation TU104GL [Tesla T4] (rev...
I can pass four individual Tesla P100s through to four VMs, but when I combine them and pass through more than one, I get the following error when running dmesg | grep NVRM.
One of the four works, but with any number above one, the output below is produced.
admin@gpu-host:~$ dmesg | grep NVRM
[ 4.550588] NVRM...
In my Proxmox VE 4.1 installations, disabling KVM hardware virtualization fixes an nvidia-smi error message, and I would like to understand why. Here is the error message that nvidia-smi produces on the Linux guest with KVM hardware virtualization enabled:
Unable to determine the...