I am trying to install a Tesla K80 on Proxmox 6.1-7 (I am too scared to upgrade to the latest and greatest...).
I was able to pass both cards (the K80 shows up as two GPUs in lspci) to a Windows VM without 4G decoding.
The VM booted fine and I went on to install the required NVIDIA drivers. The machine rebooted and now refuses to boot. If I remove 4G decoding I get the error below.
PS: This box and its virtual machines have been running a bunch of NVIDIA Quadros without issues, so I assume that the issue is K80 related.
Here is the config:
Code:
qm config 111
agent: 1
args: -machine pc,max-ram-below-4g=4G
bootdisk: scsi0
cores: 15
cpu: host,flags=+pdpe1gb
hostpci0: 03:00,pcie=1,rombar=0
hostpci1: 06:00.0,pcie=1
hostpci2: 07:00.0,pcie=1
ide2: none,media=cdrom
machine: q35
memory: 40000
name: Abaqus2020
net0: virtio=E2:47:D3:F8:55:16,bridge=vmbr0,firewall=1
numa: 1
ostype: win10
scsi0: ssdzfs:vm-111-disk-0,discard=on,iothread=1,size=150G,ssd=1
scsihw: virtio-scsi-single
smbios1: uuid=b4b5c719-f41e-4122-87d3-ea156a0f6677
sockets: 2
tablet: 1
usb0: host=3-1,usb3=1
vmgenid: a0af1fb9-4037-46e4-bf9a-2ff90781c941
Error when starting the VM:
Code:
kvm:/usr/share/qemu-server/pve-q35-4.0.cfg:1: Bus 'pcie.0' not found
start failed: QEMU exited with code 1
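One thing that can be cross-checked directly from the config above is that `args` passes a raw `-machine pc,...` option while the `machine` line says `q35`. Below is a minimal sketch that flags such a mismatch in a `qm config` dump. `machine_types` is a hypothetical helper, not part of Proxmox, and whether this mismatch is what triggers the `pcie.0` error is an assumption that has not been confirmed here.

```python
import re


def machine_types(qm_config: str) -> dict:
    """Collect the machine types named in a `qm config` dump.

    'machine' is the value of the `machine:` line; 'args' is the type
    passed through a raw `-machine` option in the `args:` line, if any.
    """
    found = {"machine": None, "args": None}
    for line in qm_config.splitlines():
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "machine":
            found["machine"] = value
        elif key == "args":
            # Accept both "-machine pc,..." and "-machine type=pc,..."
            m = re.search(r"-machine\s+(?:type=)?([^,\s]+)", value)
            if m:
                found["args"] = m.group(1)
    return found


# Two lines copied from the config of VM 111 above
config = """\
args: -machine pc,max-ram-below-4g=4G
machine: q35
"""

types = machine_types(config)
if types["args"] and types["machine"] and types["args"] != types["machine"]:
    print("args names machine type", types["args"],
          "but the machine line says", types["machine"])
```

The same function can be fed the full output of `qm config 111`; lines it does not recognise are ignored.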
Here is what I am getting from the console (not much, IMHO; I manually patched the E1000 driver):
Code:
Mar 15 12:46:10 proxbox kernel: [ 854.292859] device tap111i0 entered promiscuous mode
Mar 15 12:46:11 proxbox kernel: [ 854.339792] fwbr111i0: port 1(fwln111i0) entered blocking state
Mar 15 12:46:11 proxbox kernel: [ 854.339811] fwbr111i0: port 1(fwln111i0) entered disabled state
Mar 15 12:46:11 proxbox kernel: [ 854.339934] device fwln111i0 entered promiscuous mode
Mar 15 12:46:11 proxbox kernel: [ 854.340008] fwbr111i0: port 1(fwln111i0) entered blocking state
Mar 15 12:46:11 proxbox kernel: [ 854.340013] fwbr111i0: port 1(fwln111i0) entered forwarding state
Mar 15 12:46:11 proxbox kernel: [ 854.345126] vmbr0: port 2(fwpr111p0) entered blocking state
Mar 15 12:46:11 proxbox kernel: [ 854.345131] vmbr0: port 2(fwpr111p0) entered disabled state
Mar 15 12:46:11 proxbox kernel: [ 854.345219] device fwpr111p0 entered promiscuous mode
Mar 15 12:46:11 proxbox kernel: [ 854.345265] vmbr0: port 2(fwpr111p0) entered blocking state
Mar 15 12:46:11 proxbox kernel: [ 854.345268] vmbr0: port 2(fwpr111p0) entered forwarding state
Mar 15 12:46:11 proxbox kernel: [ 854.350491] fwbr111i0: port 2(tap111i0) entered blocking state
Mar 15 12:46:11 proxbox kernel: [ 854.350495] fwbr111i0: port 2(tap111i0) entered disabled state
Mar 15 12:46:11 proxbox kernel: [ 854.350598] fwbr111i0: port 2(tap111i0) entered blocking state
Mar 15 12:46:11 proxbox kernel: [ 854.350601] fwbr111i0: port 2(tap111i0) entered forwarding state
Mar 15 12:46:11 proxbox kernel: [ 854.724182] fwbr111i0: port 2(tap111i0) entered disabled state
Mar 15 12:46:11 proxbox kernel: [ 854.746326] fwbr111i0: port 1(fwln111i0) entered disabled state
Mar 15 12:46:11 proxbox kernel: [ 854.746409] vmbr0: port 2(fwpr111p0) entered disabled state
Mar 15 12:46:11 proxbox kernel: [ 854.746774] device fwln111i0 left promiscuous mode
Mar 15 12:46:11 proxbox kernel: [ 854.746783] fwbr111i0: port 1(fwln111i0) entered disabled state
Mar 15 12:46:11 proxbox kernel: [ 854.765504] device fwpr111p0 left promiscuous mode
Mar 15 12:46:11 proxbox kernel: [ 854.765512] vmbr0: port 2(fwpr111p0) entered disabled state
This is the output of dmesg:
Code:
dmesg | grep -e DMAR -e IOMMU
[ 0.012706] ACPI: DMAR 0x00000000B9E1A908 000108 (v01 DELL CBX3 00000001 INTL 20091013)
[ 0.280077] DMAR: IOMMU enabled
[ 0.517030] DMAR: Host address width 46
[ 0.517032] DMAR: DRHD base: 0x000000f7ffc000 flags: 0x0
[ 0.517039] DMAR: dmar0: reg_base_addr f7ffc000 ver 1:0 cap d2078c106f0466 ecap f020df
[ 0.517041] DMAR: DRHD base: 0x000000f3ffd000 flags: 0x0
[ 0.517046] DMAR: dmar1: reg_base_addr f3ffd000 ver 1:0 cap d2008c10ef0466 ecap f0205b
[ 0.517049] DMAR: DRHD base: 0x000000f3ffc000 flags: 0x1
[ 0.517053] DMAR: dmar2: reg_base_addr f3ffc000 ver 1:0 cap d2078c106f0466 ecap f020df
[ 0.517056] DMAR: RMRR base: 0x000000bafae000 end: 0x000000bafbcfff
[ 0.517058] DMAR: ATSR flags: 0x0
[ 0.517060] DMAR: RHSA base: 0x000000f3ffc000 proximity domain: 0x0
[ 0.517062] DMAR: RHSA base: 0x000000f7ffc000 proximity domain: 0x1
[ 0.517066] DMAR-IR: IOAPIC id 3 under DRHD base 0xf7ffc000 IOMMU 0
[ 0.517068] DMAR-IR: IOAPIC id 1 under DRHD base 0xf3ffc000 IOMMU 2
[ 0.517070] DMAR-IR: IOAPIC id 2 under DRHD base 0xf3ffc000 IOMMU 2
[ 0.517072] DMAR-IR: HPET id 0 under DRHD base 0xf3ffc000
[ 0.517074] DMAR-IR: x2apic is disabled because BIOS sets x2apic opt out bit.
[ 0.517075] DMAR-IR: Use 'intremap=no_x2apic_optout' to override the BIOS setting.
[ 0.517909] DMAR-IR: Enabled IRQ remapping in xapic mode
[ 1.622623] DMAR: dmar1: Using Queued invalidation
[ 1.622633] DMAR: dmar2: Using Queued invalidation
[ 1.664133] DMAR: Intel(R) Virtualization Technology for Directed I/O
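For anyone comparing their own host against this output, the two lines that matter most for passthrough are probably `DMAR: IOMMU enabled` and the `DMAR-IR: Enabled IRQ remapping` line. A minimal sketch that looks for them in saved `dmesg` text follows. `iommu_status` is a hypothetical helper, and it assumes an Intel host where the messages are worded exactly as in the log above.

```python
def iommu_status(dmesg: str) -> dict:
    """Report whether the Intel IOMMU messages seen above are present."""
    return {
        # Printed when the kernel is booted with intel_iommu=on
        "iommu_enabled": "DMAR: IOMMU enabled" in dmesg,
        # Printed once interrupt remapping is active (xapic or x2apic mode)
        "irq_remapping": "DMAR-IR: Enabled IRQ remapping" in dmesg,
    }


# Three lines copied from the dmesg output above
sample = """\
[ 0.280077] DMAR: IOMMU enabled
[ 0.517909] DMAR-IR: Enabled IRQ remapping in xapic mode
[ 1.664133] DMAR: Intel(R) Virtualization Technology for Directed I/O
"""

print(iommu_status(sample))
```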
lspci correctly detects the K80:
Code:
# lspci | grep NVIDIA
03:00.0 VGA compatible controller: NVIDIA Corporation GK104GL [Quadro K4200] (rev a1)
03:00.1 Audio device: NVIDIA Corporation GK104 HDMI Audio Controller (rev a1)
06:00.0 3D controller: NVIDIA Corporation GK210GL [Tesla K80] (rev a1)
07:00.0 3D controller: NVIDIA Corporation GK210GL [Tesla K80] (rev a1)
# lspci -n -s 06:00
06:00.0 0302: 10de:102d (rev a1)
# lspci -n -s 07:00
07:00.0 0302: 10de:102d (rev a1)
vfio.conf for PCI passthrough:
Code:
more /etc/modprobe.d/vfio.conf
options vfio-pci ids=10de:13f1,10de:0fbb disable_vga=1
options vfio-pci ids=10de:0dd8,10de:0be9 disable_vga=1
options vfio-pci ids=10de:13bb,10de:0fbc disable_vga=1
options vfio-pci ids=10de:0ffe,10de:0e1b disable_vga=1
options vfio-pci ids=10de:1eb1,10de:10f8,10de:1ad8,10de:1ad9 disable_vga=1
options vfio-pci ids=10de:102d
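To double-check that the K80's vendor:device ID from `lspci -n` really is listed in `vfio.conf`, the two outputs can be compared mechanically. This is a minimal sketch under stated assumptions: `vfio_ids` and `lspci_ids` are hypothetical helpers, and the script simply collects every `ids=` value across all `options vfio-pci` lines. Whether modprobe actually honours all of those repeated lines is not verified here.

```python
import re


def vfio_ids(vfio_conf: str) -> set:
    """Collect every vendor:device ID listed in `options vfio-pci ids=...` lines."""
    ids = set()
    for line in vfio_conf.splitlines():
        line = line.strip()
        if not line.startswith("options vfio-pci"):
            continue
        m = re.search(r"\bids=(\S+)", line)
        if m:
            ids.update(m.group(1).lower().split(","))
    return ids


def lspci_ids(lspci_n: str) -> dict:
    """Map PCI address -> vendor:device from `lspci -n` output lines."""
    devices = {}
    for line in lspci_n.splitlines():
        m = re.match(
            r"(\S+)\s+[0-9a-fA-F]{4}:\s+([0-9a-fA-F]{4}:[0-9a-fA-F]{4})", line
        )
        if m:
            devices[m.group(1)] = m.group(2).lower()
    return devices


# Lines copied from the outputs above
conf = """\
options vfio-pci ids=10de:13f1,10de:0fbb disable_vga=1
options vfio-pci ids=10de:102d
"""
pci = """\
06:00.0 0302: 10de:102d (rev a1)
07:00.0 0302: 10de:102d (rev a1)
"""

listed = vfio_ids(conf)
for address, dev_id in lspci_ids(pci).items():
    state = "listed" if dev_id in listed else "NOT listed"
    print(address, dev_id, state, "in vfio.conf")
```

This only confirms that the ID appears in the file. It does not show which driver is actually bound to the cards at runtime.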