Passthrough of PCIe 4x 2.5GbE card not fully working

Graxo

New Member
Jun 23, 2024
Hello,

I'm in need of some help; I hope someone has experience with my issue.

I can't seem to pass through 2 NICs to my OPNsense VM. I tried it both with the devices mapped and without mapping them. The PCIe card has 4 NICs. 2 of them are in a bond (bond0) for VM traffic (this works fine). The management NIC is the one on the motherboard, which also works fine. So the issue is the passthrough of the other 2 NICs to the OPNsense VM.
If more info is needed, please let me know.

Server Hardware:
Mobo: PRIME H610I-PLUS D4-CSM (Bios ver. 3212)
CPU: Intel® Core™ i3-12100 Processor
Nic: KALEA-INFORMATIQUE PCIe-card 2.5 x4 LAN Gigabit (https://www.amazon.nl/dp/B0BJW3H962)
IOMMU and/or VT-d is enabled in the BIOS.
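
Since passthrough depends on how the NICs are split into IOMMU groups, that grouping is useful to post alongside the hardware list. Below is a minimal sketch for listing it, assuming the usual sysfs layout under /sys/kernel/iommu_groups. The function name list_iommu_groups and its optional directory argument are made up for this example and are not a Proxmox tool.

```shell
#!/usr/bin/env bash
# List every PCI device per IOMMU group by walking sysfs.
# list_iommu_groups is a hypothetical helper name. The root directory is a
# parameter (default: the real sysfs path) so it can be pointed at a copy.
list_iommu_groups() {
    local root="${1:-/sys/kernel/iommu_groups}"
    local group dev
    # Without an active IOMMU the directory is missing or empty.
    [ -d "$root" ] || { echo "no IOMMU groups under $root" >&2; return 1; }
    for group in "$root"/*; do
        [ -d "$group/devices" ] || continue
        for dev in "$group"/devices/*; do
            [ -e "$dev" ] || continue
            # Print "group <number>: <PCI address>"
            printf 'group %s: %s\n' "${group##*/}" "${dev##*/}"
        done
    done
}
```

Calling list_iommu_groups without arguments on the host prints one line per device. Devices that share a group number can generally only be handed to a VM together.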

Log output when starting the VM:
Code:
Jun 23 09:59:36 pve-fw pvedaemon[2233]: start VM 103: UPID:pve-fw:000008B9:00002AB8:6677D5E8:qmstart:103:root@pam:
Jun 23 09:59:36 pve-fw kernel: igc 0000:05:00.0 enp5s0: PHC removed
Jun 23 09:59:36 pve-fw kernel: pcieport 0000:00:01.0: AER: Multiple Correctable error message received from 0000:02:03.0
Jun 23 09:59:36 pve-fw kernel: pcieport 0000:00:01.0: PCIe Bus Error: severity=Correctable, type=Transaction Layer, (Receiver ID)
Jun 23 09:59:36 pve-fw kernel: pcieport 0000:00:01.0:   device [8086:460d] error status/mask=00008000/00002000
Jun 23 09:59:36 pve-fw kernel: pcieport 0000:00:01.0:    [15] HeaderOF
Jun 23 09:59:36 pve-fw kernel: pcieport 0000:02:03.0: PCIe Bus Error: severity=Correctable, type=Data Link Layer, (Receiver ID)
Jun 23 09:59:36 pve-fw kernel: pcieport 0000:02:03.0:   device [12d8:2608] error status/mask=00000040/00002000
Jun 23 09:59:36 pve-fw kernel: pcieport 0000:02:03.0:    [ 6] BadTLP
Jun 23 09:59:36 pve-fw kernel: pcieport 0000:02:03.0: AER:   Error of this Agent is reported first
Jun 23 09:59:36 pve-fw kernel: pcieport 0000:00:01.0: AER: Multiple Uncorrectable (Non-Fatal) error message received from 0000:00:01.0
Jun 23 09:59:36 pve-fw kernel: pcieport 0000:00:01.0: PCIe Bus Error: severity=Uncorrectable (Non-Fatal), type=Transaction Layer, (Requester ID)
Jun 23 09:59:36 pve-fw kernel: pcieport 0000:00:01.0:   device [8086:460d] error status/mask=00100000/00010000
Jun 23 09:59:36 pve-fw kernel: pcieport 0000:00:01.0:    [20] UnsupReq               (First)
Jun 23 09:59:36 pve-fw kernel: pcieport 0000:00:01.0: AER:   TLP Header: 30000000 02180032 00000000 00000000

INFO: task irq/122-aerdrv:92 blocked for more than 122 seconds.
Jun 23 10:01:54 pve-fw kernel:       Tainted: P           O       6.8.4-2-pve #1
Jun 23 10:01:54 pve-fw kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jun 23 10:01:54 pve-fw kernel: task:irq/122-aerdrv  state:D stack:0     pid:92    tgid:92    ppid:2      flags:0x00004000
Jun 23 10:01:54 pve-fw kernel: Call Trace:
Jun 23 10:01:54 pve-fw kernel:  <TASK>
Jun 23 10:01:54 pve-fw kernel:  __schedule+0x401/0x15e0
Jun 23 10:01:54 pve-fw kernel:  schedule+0x33/0x110
Jun 23 10:01:54 pve-fw kernel:  schedule_preempt_disabled+0x15/0x30
Jun 23 10:01:54 pve-fw kernel:  __mutex_lock.constprop.0+0x3f8/0x7a0
Jun 23 10:01:54 pve-fw kernel:  __mutex_lock_slowpath+0x13/0x20
Jun 23 10:01:54 pve-fw kernel:  mutex_lock+0x3c/0x50
Jun 23 10:01:54 pve-fw kernel:  report_slot_reset+0x23/0xa0
Jun 23 10:01:54 pve-fw kernel:  ? __pfx_report_slot_reset+0x10/0x10
Jun 23 10:01:54 pve-fw kernel:  __pci_walk_bus+0x71/0xe0
Jun 23 10:01:54 pve-fw kernel:  pci_walk_bus+0x10/0x20
Jun 23 10:01:54 pve-fw kernel:  pcie_do_recovery+0x20b/0x3d0
Jun 23 10:01:54 pve-fw kernel:  ? __pfx_aer_root_reset+0x10/0x10
Jun 23 10:01:54 pve-fw kernel:  aer_process_err_devices+0x17a/0x1c0
Jun 23 10:01:54 pve-fw kernel:  aer_isr+0x1b5/0x1e0
Jun 23 10:01:54 pve-fw kernel:  ? __pfx_irq_thread_fn+0x10/0x10
Jun 23 10:01:54 pve-fw kernel:  irq_thread_fn+0x21/0x70
Jun 23 10:01:54 pve-fw kernel:  irq_thread+0xf8/0x1c0
Jun 23 10:01:54 pve-fw kernel:  ? __pfx_irq_thread_dtor+0x10/0x10
Jun 23 10:01:54 pve-fw kernel:  ? __pfx_irq_thread+0x10/0x10
Jun 23 10:01:54 pve-fw kernel:  kthread+0xef/0x120
Jun 23 10:01:54 pve-fw kernel:  ? __pfx_kthread+0x10/0x10
Jun 23 10:01:54 pve-fw kernel:  ret_from_fork+0x44/0x70
Jun 23 10:01:54 pve-fw kernel:  ? __pfx_kthread+0x10/0x10
Jun 23 10:01:54 pve-fw kernel:  ret_from_fork_asm+0x1b/0x30
Jun 23 10:01:54 pve-fw kernel:  </TASK>
Jun 23 10:01:54 pve-fw kernel: INFO: task kworker/2:2:396 blocked for more than 122 seconds.
Jun 23 10:01:54 pve-fw kernel:       Tainted: P           O       6.8.4-2-pve #1
Jun 23 10:01:54 pve-fw kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jun 23 10:01:54 pve-fw kernel: task:kworker/2:2     state:D stack:0     pid:396   tgid:396   ppid:2      flags:0x00004000
Jun 23 10:01:54 pve-fw kernel: Workqueue: events linkwatch_event
Jun 23 10:01:54 pve-fw kernel: Call Trace:
Jun 23 10:01:54 pve-fw kernel:  <TASK>
Jun 23 10:01:54 pve-fw kernel:  __schedule+0x401/0x15e0
Jun 23 10:01:54 pve-fw kernel:  schedule+0x33/0x110
Jun 23 10:01:54 pve-fw kernel:  schedule_preempt_disabled+0x15/0x30
Jun 23 10:01:54 pve-fw kernel:  __mutex_lock.constprop.0+0x3f8/0x7a0
Jun 23 10:01:54 pve-fw kernel:  ? finish_task_switch.isra.0+0x8c/0x310
Jun 23 10:01:54 pve-fw kernel:  ? add_timer_on+0xf9/0x150
Jun 23 10:01:54 pve-fw kernel:  __mutex_lock_slowpath+0x13/0x20
Jun 23 10:01:54 pve-fw kernel:  mutex_lock+0x3c/0x50
Jun 23 10:01:54 pve-fw kernel:  rtnl_lock+0x15/0x20
Jun 23 10:01:54 pve-fw kernel:  linkwatch_event+0x12/0x40
Jun 23 10:01:54 pve-fw kernel:  process_one_work+0x16a/0x350
Jun 23 10:01:54 pve-fw kernel:  worker_thread+0x306/0x440
Jun 23 10:01:54 pve-fw kernel:  ? __pfx_worker_thread+0x10/0x10
Jun 23 10:01:54 pve-fw kernel:  kthread+0xef/0x120
Jun 23 10:01:54 pve-fw kernel:  ? __pfx_kthread+0x10/0x10
Jun 23 10:01:54 pve-fw kernel:  ret_from_fork+0x44/0x70
Jun 23 10:01:54 pve-fw kernel:  ? __pfx_kthread+0x10/0x10
Jun 23 10:01:54 pve-fw kernel:  ret_from_fork_asm+0x1b/0x30
Jun 23 10:01:54 pve-fw kernel:  </TASK>
Jun 23 10:01:54 pve-fw kernel: INFO: task pvestatd:1468 blocked for more than 122 seconds.
Jun 23 10:01:54 pve-fw kernel:       Tainted: P           O       6.8.4-2-pve #1
Jun 23 10:01:54 pve-fw kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jun 23 10:01:54 pve-fw kernel: task:pvestatd        state:D stack:0     pid:1468  tgid:1468  ppid:1      flags:0x00000002
Jun 23 10:01:54 pve-fw kernel: Call Trace:
Jun 23 10:01:54 pve-fw kernel:  <TASK>
Jun 23 10:01:54 pve-fw kernel:  __schedule+0x401/0x15e0
Jun 23 10:01:54 pve-fw kernel:  schedule+0x33/0x110
Jun 23 10:01:54 pve-fw kernel:  schedule_preempt_disabled+0x15/0x30
Jun 23 10:01:54 pve-fw kernel:  __mutex_lock.constprop.0+0x3f8/0x7a0
Jun 23 10:01:54 pve-fw kernel:  ? filemap_get_read_batch+0x149/0x280
Jun 23 10:01:54 pve-fw kernel:  __mutex_lock_slowpath+0x13/0x20
Jun 23 10:01:54 pve-fw kernel:  mutex_lock+0x3c/0x50
Jun 23 10:01:54 pve-fw kernel:  __netlink_dump_start+0x76/0x2a0
Jun 23 10:01:54 pve-fw kernel:  ? __pfx_rtnl_dump_all+0x10/0x10
Jun 23 10:01:54 pve-fw kernel:  rtnetlink_rcv_msg+0x280/0x3c0
Jun 23 10:01:54 pve-fw kernel:  ? __pfx_rtnl_dump_all+0x10/0x10
Jun 23 10:01:54 pve-fw kernel:  ? __pfx_rtnetlink_rcv_msg+0x10/0x10
Jun 23 10:01:54 pve-fw kernel:  netlink_rcv_skb+0x5a/0x110
Jun 23 10:01:54 pve-fw kernel:  rtnetlink_rcv+0x15/0x30
Jun 23 10:01:54 pve-fw kernel:  netlink_unicast+0x1b0/0x2a0
Jun 23 10:01:54 pve-fw kernel:  netlink_sendmsg+0x214/0x470
Jun 23 10:01:54 pve-fw kernel:  __sys_sendto+0x21b/0x230
Jun 23 10:01:54 pve-fw kernel:  __x64_sys_sendto+0x24/0x40
Jun 23 10:01:54 pve-fw kernel:  do_syscall_64+0x84/0x180
Jun 23 10:01:54 pve-fw kernel:  ? do_syscall_64+0x93/0x180
Jun 23 10:01:54 pve-fw kernel:  ? irqentry_exit+0x43/0x50
Jun 23 10:01:54 pve-fw kernel:  ? exc_page_fault+0x94/0x1b0
Jun 23 10:01:54 pve-fw kernel:  entry_SYSCALL_64_after_hwframe+0x73/0x7b
Jun 23 10:01:54 pve-fw kernel: RIP: 0033:0x78eaa9b35b93
Jun 23 10:01:54 pve-fw kernel: RSP: 002b:00007ffffaa0dfe8 EFLAGS: 00000202 ORIG_RAX: 000000000000002c
Jun 23 10:01:54 pve-fw kernel: RAX: ffffffffffffffda RBX: 00007ffffaa0f0d0 RCX: 000078eaa9b35b93
Jun 23 10:01:54 pve-fw kernel: RDX: 0000000000000014 RSI: 00007ffffaa0f0d0 RDI: 0000000000000008
Jun 23 10:01:54 pve-fw kernel: RBP: 00007ffffaa0f120 R08: 00007ffffaa0f074 R09: 000000000000000c
Jun 23 10:01:54 pve-fw kernel: R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000008
Jun 23 10:01:54 pve-fw kernel: R13: 00007ffffaa0f1c8 R14: 00007ffffaa0f1d0 R15: 00005b11b26142a0
Jun 23 10:01:54 pve-fw kernel:  </TASK>

File: /etc/default/grub
Code:
root@pve-fw:~# cat /etc/default/grub
GRUB_DEFAULT=0
GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR=`lsb_release -i -s 2> /dev/null || echo Debian`
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on iommu=pt pcie_port_pm=off pcie_acs_override=downstream,multifunction"
GRUB_CMDLINE_LINUX=""

Changed the kernel command line:
From: GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on iommu=pt pcie_port_pm=off pcie_acs_override=downstream,multifunction"
To: GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on"
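
An edit to /etc/default/grub only takes effect after running update-grub and rebooting (assuming the host boots via GRUB), so it helps to confirm what the running kernel actually received. A small sketch of such a check follows. The helper name has_kernel_param is made up for this example, and it reads /proc/cmdline unless a command line string is passed as the second argument.

```shell
#!/usr/bin/env bash
# Succeed if a kernel parameter appears as a whole word on the command line.
# has_kernel_param is a hypothetical helper name. The second argument
# defaults to the running kernel's command line; pass a string to check a
# planned GRUB_CMDLINE_LINUX_DEFAULT value instead.
has_kernel_param() {
    local param="$1"
    local cmdline="${2-$(cat /proc/cmdline)}"
    # Pad with spaces so only complete, space-separated words match.
    case " $cmdline " in
        *" $param "*) return 0 ;;
    esac
    return 1
}
```

For example, after the reboot, `has_kernel_param iommu=pt && echo "still set"` shows whether the old option is really gone.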

Code:
Jun 23 11:11:00 pve-fw pvedaemon[2107]: start VM 103: UPID:pve-fw:0000083B:00002772:6677E6A4:qmstart:103:root@pam:
Jun 23 11:11:00 pve-fw pvedaemon[1480]: <root@pam> starting task UPID:pve-fw:0000083B:00002772:6677E6A4:qmstart:103:root@pam:
Jun 23 11:11:00 pve-fw kernel: igc 0000:05:00.0 enp5s0: PHC removed
Jun 23 11:11:00 pve-fw kernel: pcieport 0000:00:01.0: AER: Multiple Correctable error message received from 0000:02:03.0
Jun 23 11:11:00 pve-fw kernel: pcieport 0000:00:01.0: PCIe Bus Error: severity=Correctable, type=Transaction Layer, (Receiver ID)
Jun 23 11:11:00 pve-fw kernel: pcieport 0000:00:01.0:   device [8086:460d] error status/mask=0000a000/00002000
Jun 23 11:11:00 pve-fw kernel: pcieport 0000:00:01.0:    [15] HeaderOF
Jun 23 11:11:00 pve-fw kernel: pcieport 0000:02:03.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Transmitter ID)
Jun 23 11:11:00 pve-fw kernel: pcieport 0000:02:03.0:   device [12d8:2608] error status/mask=00001141/00002000
Jun 23 11:11:00 pve-fw kernel: pcieport 0000:02:03.0:    [ 0] RxErr                  (First)
Jun 23 11:11:00 pve-fw kernel: pcieport 0000:02:03.0:    [ 6] BadTLP
Jun 23 11:11:00 pve-fw kernel: pcieport 0000:02:03.0:    [ 8] Rollover
Jun 23 11:11:00 pve-fw kernel: pcieport 0000:02:03.0:    [12] Timeout
Jun 23 11:11:00 pve-fw kernel: pcieport 0000:02:03.0: AER:   Error of this Agent is reported first
Jun 23 11:11:00 pve-fw kernel: pcieport 0000:00:01.0: AER: Multiple Uncorrectable (Non-Fatal) error message received from 0000:00:01.0
Jun 23 11:11:00 pve-fw kernel: pcieport 0000:00:01.0: PCIe Bus Error: severity=Uncorrectable (Non-Fatal), type=Transaction Layer, (Requester ID)
Jun 23 11:11:00 pve-fw kernel: pcieport 0000:00:01.0:   device [8086:460d] error status/mask=00100000/00010000
Jun 23 11:11:00 pve-fw kernel: pcieport 0000:00:01.0:    [20] UnsupReq               (First)
Jun 23 11:11:00 pve-fw kernel: pcieport 0000:00:01.0: AER:   TLP Header: 30000000 02180032 00000000 00000000
Jun 23 11:11:00 pve-fw kernel: pcieport 0000:02:03.0: pciehp: Slot(0-2): Link Down
Jun 23 11:11:00 pve-fw kernel: pci 0000:05:00.0: Unable to change power state from unknown to D0, device inaccessible
Jun 23 11:11:00 pve-fw kernel: bond0: (slave enp3s0): link status definitely down, disabling slave
Jun 23 11:11:01 pve-fw kernel: pci 0000:05:00.0: timed out waiting for pending transaction; performing function level reset anyway
Jun 23 11:11:03 pve-fw kernel: pci 0000:05:00.0: not ready 1023ms after FLR; waiting
Jun 23 11:11:04 pve-fw kernel: pci 0000:05:00.0: not ready 2047ms after FLR; waiting
Jun 23 11:11:06 pve-fw kernel: pci 0000:05:00.0: not ready 4095ms after FLR; waiting
Jun 23 11:11:11 pve-fw kernel: pci 0000:05:00.0: not ready 8191ms after FLR; waiting
Jun 23 11:11:19 pve-fw kernel: pci 0000:05:00.0: not ready 16383ms after FLR; waiting
Jun 23 11:11:36 pve-fw kernel: pci 0000:05:00.0: not ready 32767ms after FLR; waiting
Jun 23 11:12:12 pve-fw kernel: pci 0000:05:00.0: not ready 65535ms after FLR; giving up
Jun 23 11:12:13 pve-fw kernel: pci 0000:05:00.0: not ready 1023ms after bus reset; giving up
Jun 23 11:12:15 pve-fw kernel: pci 0000:05:00.0: not ready 1023ms after bus reset; waiting
Jun 23 11:12:17 pve-fw kernel: pci 0000:05:00.0: not ready 2047ms after bus reset; waiting
Jun 23 11:12:19 pve-fw kernel: pci 0000:05:00.0: not ready 4095ms after bus reset; waiting
Jun 23 11:12:23 pve-fw kernel: pci 0000:05:00.0: not ready 8191ms after bus reset; waiting
Jun 23 11:12:32 pve-fw kernel: pci 0000:05:00.0: not ready 16383ms after bus reset; waiting
Jun 23 11:12:48 pve-fw kernel: pci 0000:05:00.0: not ready 32767ms after bus reset; waiting
Jun 23 11:13:04 pve-fw pvedaemon[1479]: <root@pam> successful auth for user 'root@pam'
Jun 23 11:13:22 pve-fw kernel: pci 0000:05:00.0: not ready 65535ms after bus reset; giving up
Jun 23 11:13:23 pve-fw kernel: pci 0000:05:00.0: not ready 1023ms after bus reset; giving up
Jun 23 11:13:25 pve-fw kernel: pci 0000:05:00.0: not ready 1023ms after bus reset; waiting
Jun 23 11:13:26 pve-fw kernel: INFO: task irq/122-aerdrv:91 blocked for more than 122 seconds.
Jun 23 11:13:26 pve-fw kernel:       Tainted: P           O       6.8.4-2-pve #1
Jun 23 11:13:26 pve-fw kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jun 23 11:13:26 pve-fw kernel: task:irq/122-aerdrv  state:D stack:0     pid:91    tgid:91    ppid:2      flags:0x00004000
Jun 23 11:13:26 pve-fw kernel: Call Trace:
Jun 23 11:13:26 pve-fw kernel:  <TASK>
Jun 23 11:13:26 pve-fw kernel:  __schedule+0x401/0x15e0
Jun 23 11:13:26 pve-fw kernel:  ? netlink_broadcast_filtered+0x17b/0x520
Jun 23 11:13:26 pve-fw kernel:  schedule+0x33/0x110
Jun 23 11:13:26 pve-fw kernel:  schedule_preempt_disabled+0x15/0x30
Jun 23 11:13:26 pve-fw kernel:  __mutex_lock.constprop.0+0x3f8/0x7a0
Jun 23 11:13:26 pve-fw kernel:  __mutex_lock_slowpath+0x13/0x20
Jun 23 11:13:26 pve-fw kernel:  mutex_lock+0x3c/0x50
Jun 23 11:13:26 pve-fw kernel:  report_error_detected+0x28/0x1c0
Jun 23 11:13:26 pve-fw kernel:  ? __pfx_report_normal_detected+0x10/0x10
Jun 23 11:13:26 pve-fw kernel:  report_normal_detected+0x16/0x30
Jun 23 11:13:26 pve-fw kernel:  __pci_walk_bus+0x71/0xe0
Jun 23 11:13:26 pve-fw kernel:  pci_walk_bus+0x10/0x20
Jun 23 11:13:26 pve-fw kernel:  pcie_do_recovery+0xd4/0x3d0
Jun 23 11:13:26 pve-fw kernel:  ? __pfx_aer_root_reset+0x10/0x10
Jun 23 11:13:26 pve-fw kernel:  aer_process_err_devices+0x17a/0x1c0
Jun 23 11:13:26 pve-fw kernel:  aer_isr+0x1b5/0x1e0
Jun 23 11:13:26 pve-fw kernel:  ? __pfx_irq_thread_fn+0x10/0x10
Jun 23 11:13:26 pve-fw kernel:  irq_thread_fn+0x21/0x70
Jun 23 11:13:26 pve-fw kernel:  irq_thread+0xf8/0x1c0
Jun 23 11:13:26 pve-fw kernel:  ? __pfx_irq_thread_dtor+0x10/0x10
Jun 23 11:13:26 pve-fw kernel:  ? __pfx_irq_thread+0x10/0x10
Jun 23 11:13:26 pve-fw kernel:  kthread+0xef/0x120
Jun 23 11:13:26 pve-fw kernel:  ? __pfx_kthread+0x10/0x10
Jun 23 11:13:26 pve-fw kernel:  ret_from_fork+0x44/0x70
Jun 23 11:13:26 pve-fw kernel:  ? __pfx_kthread+0x10/0x10
Jun 23 11:13:26 pve-fw kernel:  ret_from_fork_asm+0x1b/0x30
Jun 23 11:13:26 pve-fw kernel:  </TASK>
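
To make the "error status/mask" values in the logs above easier to read, here is a small sketch that decodes a correctable AER status word into bit names. The names for bits 0, 6, 8, 12 and 15 are taken from the log lines themselves. Bits 7, 13 and 14 are filled in from the Linux AER driver's naming as an assumption. The function name decode_aer_correctable is made up for this example, and it assumes, as the logs suggest, that only unmasked bits are listed.

```shell
#!/usr/bin/env bash
# Decode a PCIe AER *correctable* error status word (hex, no 0x prefix).
# decode_aer_correctable is a hypothetical helper name.
# $1 = status, $2 = mask (optional); masked bits are not printed.
decode_aer_correctable() {
    local status mask visible entry bit name
    status=$((0x$1))
    mask=$((0x${2:-0}))
    visible=$(( status & ~mask ))
    # bit:name pairs; 7, 13 and 14 are assumed, the rest appear in the log.
    for entry in 0:RxErr 6:BadTLP 7:BadDLLP 8:Rollover 12:Timeout \
                 13:NonFatalErr 14:CorrIntErr 15:HeaderOF; do
        bit=${entry%%:*}
        name=${entry#*:}
        if [ $(( (visible >> bit) & 1 )) -eq 1 ]; then
            printf '[%2d] %s\n' "$bit" "$name"
        fi
    done
}
```

For example, `decode_aer_correctable 00001141 00002000` lists RxErr, BadTLP, Rollover and Timeout, matching what the kernel printed for 0000:02:03.0, and `decode_aer_correctable 0000a000 00002000` lists only HeaderOF because bit 13 is masked.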

Side note:
When I start the OPNsense VM with the needed NICs attached, my whole Proxmox installation stops working and I need to reboot the whole server.
 