We have installed NVIDIA-GRID-Linux-KVM-535.154.02-535.154.05-538.15 on our Proxmox 8.1 host. The card is correctly recognized by "lspci", and the installation completes without any errors if we use the self-signed branch.
However, after the installation and a reboot, the NVIDIA services start and then immediately stop with the following error messages:
-- Boot 47eedfa450964271b1ef443e97f2e714 --
Apr 10 13:35:07 prox2 systemd[1]: Starting nvidia-vgpud.service - NVIDIA vGPU Daemon...
Apr 10 13:35:07 prox2 nvidia-vgpud[810]: Verbose syslog connection opened
Apr 10 13:35:07 prox2 nvidia-vgpud[810]: Started (810)
Apr 10 13:35:07 prox2 systemd[1]: Started nvidia-vgpud.service - NVIDIA vGPU Daemon.
Apr 10 13:35:07 prox2 nvidia-vgpud[810]: Global settings:
Apr 10 13:35:07 prox2 nvidia-vgpud[810]: Size: 16
Version 1
Apr 10 13:35:07 prox2 nvidia-vgpud[810]: Homogeneous vGPUs: 1
Apr 10 13:35:07 prox2 nvidia-vgpud[810]: vGPU types: 586
Apr 10 13:35:07 prox2 nvidia-vgpud[810]:
Apr 10 13:35:07 prox2 modprobe[837]: ERROR: could not insert 'nvidia': Key was rejected by service
Apr 10 13:35:07 prox2 nvidia-vgpud[810]: error: failed to allocate client: 59
Apr 10 13:35:07 prox2 nvidia-vgpud[810]: error: failed to read pGPU information: 9
Apr 10 13:35:07 prox2 nvidia-vgpud[810]: error: failed to send vGPU configuration info to RM: 9
Apr 10 13:35:07 prox2 nvidia-vgpud[810]: PID file unlocked.
Apr 10 13:35:07 prox2 nvidia-vgpud[810]: PID file closed.
Apr 10 13:35:07 prox2 nvidia-vgpud[810]: Shutdown (810)
Apr 10 13:35:07 prox2 systemd[1]: nvidia-vgpud.service: Main process exited, code=exited, status=9/n/a
Apr 10 13:35:07 prox2 systemd[1]: nvidia-vgpud.service: Failed with result 'exit-code'.
And for vgpu-manager:
-- Boot 3fe5a35ca04b4f659ae52a20b66e9f48 --
Apr 09 17:24:25 prox2 systemd[1]: Starting nvidia-vgpu-mgr.service - NVIDIA vGPU Manager Daemon...
Apr 09 17:24:25 prox2 systemd[1]: Started nvidia-vgpu-mgr.service - NVIDIA vGPU Manager Daemon.
Apr 09 17:24:25 prox2 nvidia-vgpu-mgr[833]: error: vmiop_env_log: Failed to initialize RM client: 0x59
Apr 09 17:24:25 prox2 systemd[1]: nvidia-vgpu-mgr.service: Main process exited, code=exited, status=1/FAILURE
Apr 09 17:24:25 prox2 systemd[1]: nvidia-vgpu-mgr.service: Failed with result 'exit-code'.
-- Boot 47eedfa450964271b1ef443e97f2e714 --
Apr 10 13:35:07 prox2 systemd[1]: Starting nvidia-vgpu-mgr.service - NVIDIA vGPU Manager Daemon...
Apr 10 13:35:07 prox2 systemd[1]: Started nvidia-vgpu-mgr.service - NVIDIA vGPU Manager Daemon.
Apr 10 13:35:07 prox2 nvidia-vgpu-mgr[811]: error: vmiop_env_log: Failed to initialize RM client: 0x59
Apr 10 13:35:07 prox2 systemd[1]: nvidia-vgpu-mgr.service: Main process exited, code=exited, status=1/FAILURE
Apr 10 13:35:07 prox2 systemd[1]: nvidia-vgpu-mgr.service: Failed with result 'exit-code'.
The NVIDIA card is the only graphics card in the system, and we have a few other new hosts with the same hardware configuration. We therefore need a diagnosis of what is wrong with our configuration, since we followed the complete instructions in "NVIDIA vGPU on Proxmox VE" and the information from "https://wvthoog.nl/proxmox-7-vgpu-v2/".
At first glance the whole installation seems to work, but in fact it doesn't!
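Since we have several hosts with the same setup, it may help to pull the module-load failure out of the journal automatically. The following is only a minimal sketch, assuming journal lines in the format pasted above; the function name `find_module_load_errors` and the sample list are made up for illustration and are not part of any NVIDIA or Proxmox tooling.

```python
import re

# Matches modprobe failures in the format seen in the log above, e.g.:
#   modprobe[837]: ERROR: could not insert 'nvidia': Key was rejected by service
MODPROBE_ERROR = re.compile(
    r"modprobe\[\d+\]: ERROR: could not insert '(?P<module>[^']+)': (?P<reason>.+)$"
)


def find_module_load_errors(journal_lines):
    """Return (module, reason) pairs for every modprobe insert failure."""
    errors = []
    for line in journal_lines:
        match = MODPROBE_ERROR.search(line.rstrip())
        if match:
            errors.append((match.group("module"), match.group("reason")))
    return errors


# Sample lines copied from the log in this post.
sample = [
    "Apr 10 13:35:07 prox2 nvidia-vgpud[810]: vGPU types: 586",
    "Apr 10 13:35:07 prox2 modprobe[837]: ERROR: could not insert 'nvidia': Key was rejected by service",
    "Apr 10 13:35:07 prox2 nvidia-vgpud[810]: error: failed to allocate client: 59",
]

for module, reason in find_module_load_errors(sample):
    print(f"{module}: {reason}")
```

On the lines above this reports only the 'nvidia' module with the reason "Key was rejected by service". As far as we understand, that text is the kernel's EKEYREJECTED error, which is typically seen when module signature verification does not accept the signing key, so the signing step is our main suspect. We have not confirmed this yet.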