Hi everyone
I've started a project to build a Proxmox server for machine learning and remote gaming, but I'm stuck getting the Nvidia drivers to work in my Ubuntu 22.04 VM. The plan is to have a few VMs configured for either machine learning or gaming, using different GPUs.
I am both new to this forum and a bit new to Linux, so please bear with me if I make mistakes, and point out any info I'm missing.
So far I have successfully passed through my 1060 to a Windows VM following the guide, but I can't seem to get the K80 working properly. The VM has been set up with both of the K80's GPUs, but I had a similar error before with a VM using a single K80 GPU.
My system has an MSI Z590 motherboard, an Intel 10850K, a GTX 1060, and a Tesla K80.
I have mainly used this guide as a reference:
https://3os.org/infrastructure/prox...virtual-machine-gpu-passthrough-configuration
The only thing I have done inside the VM so far is to install the Nvidia 470 display driver through the built-in "Software & Updates" tool.
The main symptom shows up when I run this command:
Code:
nvidia-smi
Code:
No devices were found
The GPUs are listed, however, when I run this command:
Code:
lspci -nnv
Code:
01:00.0 3D controller [0302]: NVIDIA Corporation GK210GL [Tesla K80] [10de:102d] (rev a1)
	Subsystem: NVIDIA Corporation GK210GL [Tesla K80] [10de:106c]
	Physical Slot: 0
	Flags: bus master, fast devsel, latency 0, IRQ 16
	Memory at c2000000 (32-bit, non-prefetchable) [size=16M]
	Memory at 1000000000 (64-bit, prefetchable) [size=32M]
	Capabilities: <access denied>
	Kernel modules: nvidiafb, nouveau
02:00.0 3D controller [0302]: NVIDIA Corporation GK210GL [Tesla K80] [10de:102d] (rev a1)
	Subsystem: NVIDIA Corporation GK210GL [Tesla K80] [10de:106c]
	Physical Slot: 0-2
	Flags: bus master, fast devsel, latency 0, IRQ 16
	Memory at c1000000 (32-bit, non-prefetchable) [size=16M]
	Memory at 1002000000 (64-bit, prefetchable) [size=32M]
	Capabilities: <access denied>
	Kernel modules: nvidiafb, nouveau
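One thing the listing lacks is a "Kernel driver in use:" line, which lspci prints (the -k behaviour is switched on by -v) when a driver is actually bound to a device. Here is a minimal sketch for summarising that per device. The function name report_drivers is made up for illustration, and it only parses text on stdin, so it assumes lspci's usual layout of one unindented header line per device.

```shell
# Reads `lspci -k` / `lspci -v` style text on stdin and prints one line per
# device: "<slot> <bound driver>", or "<slot> (none)" when the device has
# no "Kernel driver in use:" line. report_drivers is a hypothetical helper.
report_drivers() {
  awk '
    # A device header starts with a slot such as 01:00.0 or 0000:01:00.0
    /^[0-9a-f:]+\.[0-9a-f] / {
      if (slot != "") print slot, (drv != "" ? drv : "(none)")
      slot = $1
      drv = ""
      next
    }
    # Remember the bound driver for the current device
    /Kernel driver in use:/ { drv = $NF }
    END { if (slot != "") print slot, (drv != "" ? drv : "(none)") }
  '
}

# Example (needs pciutils installed; 10de is the NVIDIA vendor ID):
#   lspci -nnk -d 10de: | report_drivers
```

Run inside the VM, this would be expected to show the guest driver; run on the Proxmox host, a passed-through GPU would be expected to show vfio-pci. Both expectations are assumptions about a working setup, not something the output above confirms.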
The best clue I have is this part of the output from the following command:
Code:
dmesg -w
Code:
[ 4.749614] resource sanity check: requesting [mem 0xc2700000-0xc36fffff], which spans more than PCI Bus 0000:01 [mem 0xc2000000-0xc2ffffff]
[ 4.749619] caller os_map_kernel_space.part.0+0x97/0xa0 [nvidia] mapping multiple BARs
[ 4.763172] NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x24:0xffff:1211)
[ 4.763299] NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 0
(same error on the other GPU; PCI bus 0000:02)
(The full log is in the attached .txt file.)
The best suggestions I could find scouring the internet were to enable "Above 4G Decoding" and to disable "CSM", both of which I have done in the host BIOS.
Any help or clues would be appreciated, as I can't really find much info about this issue.
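To make the "resource sanity check" message easier to read, here is a small sketch that turns the two address ranges from the log into sizes and offsets. The function name bar_check is made up for illustration, and the script only restates the kernel message as numbers; it does not identify the cause.

```shell
# Compare a requested MMIO range with a PCI bus window.
# Arguments: req_start req_end win_start win_end (hex like 0x... or decimal).
# bar_check is a hypothetical helper, not part of any driver tooling.
bar_check() {
  req_start=$(( $1 )); req_end=$(( $2 ))
  win_start=$(( $3 )); win_end=$(( $4 ))
  mib=1048576
  req_size=$(( req_end - req_start + 1 ))
  win_size=$(( win_end - win_start + 1 ))
  # Bytes by which the request runs past the end of the window (0 if none)
  overrun=$(( req_end - win_end ))
  [ "$overrun" -gt 0 ] || overrun=0
  echo "request=$(( req_size / mib ))MiB window=$(( win_size / mib ))MiB offset=$(( (req_start - win_start) / mib ))MiB overrun=$(( overrun / mib ))MiB"
}

# Ranges copied from the dmesg line for PCI Bus 0000:01
bar_check 0xc2700000 0xc36fffff 0xc2000000 0xc2ffffff
```

With the values from the log, the request is 16 MiB long but starts 7 MiB into a 16 MiB window, so it runs 7 MiB past the end of the window, which matches the "spans more than PCI Bus 0000:01" wording.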
Update:
I've read through the forums a bit, and I found this very informative post by Lefuneste:
https://forum.proxmox.com/threads/problem-with-gpu-passthrough.55918/post-471013
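There it was helpfully pointed out that when running
Code:
cat /proc/iomem
on the host, there should be a line containing "vfio-pci" directly under each GPU's PCIe address, which I don't get. Instead, nothing is listed under my GPU addresses. In fact, when running
Code:
cat /proc/iomem | grep vfio
I get nothing at all. Does this mean that the Nvidia drivers are successfully blocked from grabbing my GPUs, but vfio-pci fails to claim them?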
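A rough sketch of the /proc/iomem check from the linked post, so that each PCI address is reported together with whether a vfio-pci entry is nested under it. The function name vfio_claims is made up for illustration, and the script assumes the usual /proc/iomem layout, where child entries are indented further than their parent and device entries are named by PCI address such as 0000:01:00.0.

```shell
# Reads /proc/iomem-style text on stdin and prints, for each PCI function
# address found, "vfio-pci" if a vfio-pci region is nested beneath one of
# its entries, otherwise "unclaimed". vfio_claims is a hypothetical helper.
vfio_claims() {
  awk '
    {
      # Indentation depth = number of leading spaces
      rest = $0
      sub(/^ */, "", rest)
      indent = length($0) - length(rest)
      # Drop the "start-end : " prefix to get the resource name
      name = $0
      sub(/^[^:]*: /, "", name)

      if (name ~ /^[0-9a-f]+:[0-9a-f]+:[0-9a-f]+\.[0-9a-f]$/) {
        # A device entry: remember it and its depth
        dev = name
        dev_indent = indent
        if (!(dev in seen)) { seen[dev] = 1; order[++n] = dev; claimed[dev] = 0 }
      } else if (dev != "" && indent > dev_indent) {
        # A child of the current device entry
        if (name == "vfio-pci") claimed[dev] = 1
      } else {
        # Left the device subtree
        dev = ""
      }
    }
    END {
      for (i = 1; i <= n; i++)
        print order[i], (claimed[order[i]] ? "vfio-pci" : "unclaimed")
    }
  '
}

# Example, run on the Proxmox host:
#   vfio_claims < /proc/iomem
```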