Dual NVIDIA L40S GPU Support on a Single VM

Godson

Proxmox is hosted on a Dell R7625, and we have two NVIDIA L40S GPUs. We need to utilize both L40S cards in a single VM.

How do we configure this?

Kindly provide a documentation link for the above scenario.


@thousif_ahamed
 
Thank you @leesteken. However, our requirement is for two NVIDIA L40S GPUs to be presented to one single VM via Proxmox. I see we can only map one GPU to one virtual machine. We want to map two physical GPUs to one VM. Let us know the possibility. @Godson
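
For reference, Proxmox itself does not limit a VM to one GPU: a VM config accepts multiple hostpciN entries. A minimal sketch with both cards passed through, assuming a hypothetical VMID 100 and the two PCI addresses that appear later in this thread:

# /etc/pve/qemu-server/100.conf (VMID and addresses are examples)
machine: q35
hostpci0: 0000:21:00.0,pcie=1
hostpci1: 0000:e1:00.0,pcie=1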

 
Yes, we will share any errors if we encounter them. Currently, we do not have access to the device. We expect to set up the hardware next week, at which point we may face some issues.

Regards,
Thousif
 
@leesteken Adding further: we are able to see the devices' capabilities in Proxmox; both cards show up as 3D controllers.

We are not able to configure PCIe passthrough or vGPU profiles; on a Windows machine, both devices are not visible.

e1:00.0 3D controller: NVIDIA Corporation AD102GL [L40S] (rev a1)
Subsystem: NVIDIA Corporation AD102GL [L40S]
Flags: bus master, fast devsel, latency 0, IRQ 680, NUMA node 1, IOMMU group 73
Memory at 98000000 (32-bit, non-prefetchable) [size=16M]
Memory at 13000000000 (64-bit, prefetchable) [size=64G]
Memory at 12800000000 (64-bit, prefetchable) [size=32M]
Capabilities: [60] Power Management version 3
Capabilities: [68] Null
Capabilities: [78] Express Legacy Endpoint, MSI 00
Capabilities: [b4] Vendor Specific Information: Len=14 <?>
Capabilities: [c8] MSI-X: Enable+ Count=6 Masked-
Capabilities: [100] Virtual Channel
Capabilities: [250] Latency Tolerance Reporting
Capabilities: [258] L1 PM Substates
Capabilities: [128] Power Budgeting <?>
Capabilities: [420] Advanced Error Reporting
Capabilities: [600] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
Capabilities: [900] Secondary PCI Express
Capabilities: [bb0] Physical Resizable BAR
Capabilities: [bcc] Single Root I/O Virtualization (SR-IOV)
Capabilities: [c14] Alternative Routing-ID Interpretation (ARI)
Capabilities: [c1c] Physical Layer 16.0 GT/s <?>
Capabilities: [d00] Lane Margining at the Receiver <?>
Capabilities: [e00] Data Link Feature <?>
Kernel driver in use: nvidia
Kernel modules: nvidiafb, nouveau, nvidia_vgpu_vfio, nvidia


21:00.0 3D controller: NVIDIA Corporation AD102GL [L40S] (rev a1)
Subsystem: NVIDIA Corporation AD102GL [L40S]
Flags: fast devsel, IRQ 680, NUMA node 0, IOMMU group 64
Memory at 9a000000 (32-bit, non-prefetchable) [size=16M]
Memory at 11000000000 (64-bit, prefetchable) [size=64G]
Memory at 10800000000 (64-bit, prefetchable) [size=32M]
Capabilities: [60] Power Management version 3
Capabilities: [68] Null
Capabilities: [78] Express Legacy Endpoint, MSI 00
Capabilities: [b4] Vendor Specific Information: Len=14 <?>
Capabilities: [c8] MSI-X: Enable- Count=6 Masked-
Capabilities: [100] Virtual Channel
Capabilities: [250] Latency Tolerance Reporting
Capabilities: [258] L1 PM Substates
Capabilities: [128] Power Budgeting <?>
Capabilities: [420] Advanced Error Reporting
Capabilities: [600] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
Capabilities: [900] Secondary PCI Express
Capabilities: [bb0] Physical Resizable BAR
Capabilities: [bcc] Single Root I/O Virtualization (SR-IOV)
Capabilities: [c14] Alternative Routing-ID Interpretation (ARI)
Capabilities: [c1c] Physical Layer 16.0 GT/s <?>
Capabilities: [d00] Lane Margining at the Receiver <?>
Capabilities: [e00] Data Link Feature <?>
Kernel modules: nvidiafb, nouveau, nvidia_vgpu_vfio, nvidia
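
For plain PCIe passthrough, note that the first card reports "Kernel driver in use: nvidia": the host driver owns the card, so vfio-pci cannot claim it. A minimal sketch of the usual host-side steps, assuming an AMD EPYC host (R7625) with the IOMMU already active, and assuming 10de:26b9 is the L40S vendor:device ID (confirm with lspci -nn):

# confirm the IOMMU is active (AMD hosts log "AMD-Vi" entries)
dmesg | grep -e IOMMU -e AMD-Vi

# have vfio-pci claim both L40S functions at boot, ahead of the NVIDIA modules
echo "options vfio-pci ids=10de:26b9" > /etc/modprobe.d/vfio.conf
echo "softdep nvidia pre: vfio-pci" >> /etc/modprobe.d/vfio.conf
update-initramfs -u -k all

# after a reboot, "Kernel driver in use" should read vfio-pci
lspci -nnk -s 21:00.0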
 
We are not able to configure PCIe passthrough or vGPU profiles; on a Windows machine, both devices are not visible.
I have no experience with NVIDIA or vGPU. Your one sentence about "not able to", covering two conflicting technologies (PCIe passthrough and vGPU don't mix), does not really provide any information (much like the start of this thread). If you provide your VM configuration and information about what you changed from the default Proxmox installation, maybe someone else here can help you.
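
For reference, those details can be collected with standard commands, assuming a hypothetical VMID 100:

qm config 100                    # dump the VM configuration
pveversion -v                    # Proxmox package versions
lspci -nnk | grep -A 3 NVIDIA    # current driver binding per GPU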
 
@leesteken We are not able to configure the L40S either as PCIe passthrough or as vGPU; both ways it is failing, although we are able to see that the cards are capable of vGPU profiles. Can you guide us on both ways to get this done?

@Godson
 
@leesteken We can't bind the device to the Windows machine:

error writing '0000:e1:00.0' to '/sys/bus/pci/drivers/vfio-pci/bind': Device or resource busy
TASK ERROR: Cannot bind 0000:e1:00.0 to vfio

Any idea how to get this working?
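
"Device or resource busy" usually means another driver, here the host NVIDIA/vGPU driver shown in the lspci output above, still owns the function. A minimal sketch of releasing it by hand, assuming nothing on the host is using the card; PCIe passthrough and vGPU are mutually exclusive per card, so the vGPU host driver must not hold it:

# detach the function from its current driver, then hand it to vfio-pci
echo 0000:e1:00.0 > /sys/bus/pci/devices/0000:e1:00.0/driver/unbind
echo vfio-pci > /sys/bus/pci/devices/0000:e1:00.0/driver_override
echo 0000:e1:00.0 > /sys/bus/pci/drivers_probe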

@Godson

Lukas Moravek

 
Hi All,

Currently, we are facing an issue with the vGPU profiles not showing.

We followed the admin guide and installed the vGPU driver NVIDIA-GRID-Linux-KVM-570.124.03-570. on Proxmox, but the profiles are not visible. Any ideas?

nvidia-smi vgpu -s

GPU 00000000:21:00.0
No vGPUs found on this device
GPU 00000000:E1:00.0
No vGPUs found on this device

Regards,
Thousif

@Godson
 
Hi,

Did you turn on SR-IOV in BIOS?
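
On Ampere and later GPUs (the L40S is Ada Lovelace), NVIDIA's vGPU guide for Linux KVM also requires enabling the SR-IOV virtual functions on the host before any profiles appear. A sketch, assuming the GRID driver installed its sriov-manage script in the usual location:

# enable the SR-IOV VFs on all supported GPUs (must be re-run after each reboot)
/usr/lib/nvidia/sriov-manage -e ALL

# the vGPU profiles should now list
nvidia-smi vgpu -s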

I assigned the two GPUs to a resource mapping in the Datacenter.

[screenshot: Datacenter → Resource Mappings with both GPUs assigned]



And I added the two GPUs to the VM.
[screenshot: VM hardware with both mapped GPUs added]
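
A sketch of the equivalent CLI, assuming hypothetical mapping names l40s-0 and l40s-1 created under Datacenter → Resource Mappings:

# attach both mapped GPUs to VM 100 as PCIe devices (q35 machine type required for pcie=1)
qm set 100 --machine q35 --hostpci0 mapping=l40s-0,pcie=1 --hostpci1 mapping=l40s-1,pcie=1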



Finally, I could see both GPUs via nvidia-smi in an Ubuntu 24.04 server VM.
[screenshot: nvidia-smi in the guest listing both GPUs]