Passthrough PCIe 4x 2.5gbe card not fully working

Graxo

Jun 23, 2024
Hello,

I'm in need of some help; I hope someone has experience with my issue.

I can't seem to pass through two NICs to my OPNsense VM. I tried it both with and without mapping them. The PCIe card has four NICs. Two of them are in a bond (bond0) for VM traffic, which works fine. The management NIC is the one on the motherboard, and it also works fine. So the issue is only with passing through the other two NICs to the OPNsense VM.
If more info is needed, please let me know.

Server Hardware:
Mobo: PRIME H610I-PLUS D4-CSM (Bios ver. 3212)
CPU: Intel® Core™ i3-12100 Processor
Nic: KALEA-INFORMATIQUE PCIe-card 2.5 x4 LAN Gigabit (https://www.amazon.nl/dp/B0BJW3H962)
IOMMU and/or VT-d is enabled in the BIOS.
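
For reference, a rough sketch of how the IOMMU grouping can be checked on the host. VFIO hands devices to a VM per IOMMU group, so the two passthrough NICs should not share a group with the bonded ones. This assumes the standard sysfs layout under /sys/kernel/iommu_groups; the function name list_iommu_groups is made up for illustration.

```shell
# Print one line per PCI device together with its IOMMU group number.
# Usage: list_iommu_groups [ROOT]   (ROOT defaults to /sys/kernel/iommu_groups)
list_iommu_groups() {
    root="${1:-/sys/kernel/iommu_groups}"
    for dev in "$root"/*/devices/*; do
        # Skip the unexpanded glob when no groups exist at all.
        [ -e "$dev" ] || continue
        group="${dev#"$root"/}"   # strip the root prefix: "N/devices/0000:05:00.0"
        group="${group%%/*}"      # keep only the group number N
        printf 'group %s: %s\n' "$group" "${dev##*/}"
    done
}
```

Running list_iommu_groups without arguments on the host prints the real groups. No output at all usually means the IOMMU is not active in the running kernel.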

Log output when starting the VM:
Code:
Jun 23 09:59:36 pve-fw pvedaemon[2233]: start VM 103: UPID:pve-fw:000008B9:00002AB8:6677D5E8:qmstart:103:root@pam:
Jun 23 09:59:36 pve-fw kernel: igc 0000:05:00.0 enp5s0: PHC removed
Jun 23 09:59:36 pve-fw kernel: pcieport 0000:00:01.0: AER: Multiple Correctable error message received from 0000:02:03.0
Jun 23 09:59:36 pve-fw kernel: pcieport 0000:00:01.0: PCIe Bus Error: severity=Correctable, type=Transaction Layer, (Receiver ID)
Jun 23 09:59:36 pve-fw kernel: pcieport 0000:00:01.0:   device [8086:460d] error status/mask=00008000/00002000
Jun 23 09:59:36 pve-fw kernel: pcieport 0000:00:01.0:    [15] HeaderOF
Jun 23 09:59:36 pve-fw kernel: pcieport 0000:02:03.0: PCIe Bus Error: severity=Correctable, type=Data Link Layer, (Receiver ID)
Jun 23 09:59:36 pve-fw kernel: pcieport 0000:02:03.0:   device [12d8:2608] error status/mask=00000040/00002000
Jun 23 09:59:36 pve-fw kernel: pcieport 0000:02:03.0:    [ 6] BadTLP
Jun 23 09:59:36 pve-fw kernel: pcieport 0000:02:03.0: AER:   Error of this Agent is reported first
Jun 23 09:59:36 pve-fw kernel: pcieport 0000:00:01.0: AER: Multiple Uncorrectable (Non-Fatal) error message received from 0000:00:01.0
Jun 23 09:59:36 pve-fw kernel: pcieport 0000:00:01.0: PCIe Bus Error: severity=Uncorrectable (Non-Fatal), type=Transaction Layer, (Requester ID)
Jun 23 09:59:36 pve-fw kernel: pcieport 0000:00:01.0:   device [8086:460d] error status/mask=00100000/00010000
Jun 23 09:59:36 pve-fw kernel: pcieport 0000:00:01.0:    [20] UnsupReq               (First)
Jun 23 09:59:36 pve-fw kernel: pcieport 0000:00:01.0: AER:   TLP Header: 30000000 02180032 00000000 00000000

INFO: task irq/122-aerdrv:92 blocked for more than 122 seconds.
Jun 23 10:01:54 pve-fw kernel:       Tainted: P           O       6.8.4-2-pve #1
Jun 23 10:01:54 pve-fw kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jun 23 10:01:54 pve-fw kernel: task:irq/122-aerdrv  state:D stack:0     pid:92    tgid:92    ppid:2      flags:0x00004000
Jun 23 10:01:54 pve-fw kernel: Call Trace:
Jun 23 10:01:54 pve-fw kernel:  <TASK>
Jun 23 10:01:54 pve-fw kernel:  __schedule+0x401/0x15e0
Jun 23 10:01:54 pve-fw kernel:  schedule+0x33/0x110
Jun 23 10:01:54 pve-fw kernel:  schedule_preempt_disabled+0x15/0x30
Jun 23 10:01:54 pve-fw kernel:  __mutex_lock.constprop.0+0x3f8/0x7a0
Jun 23 10:01:54 pve-fw kernel:  __mutex_lock_slowpath+0x13/0x20
Jun 23 10:01:54 pve-fw kernel:  mutex_lock+0x3c/0x50
Jun 23 10:01:54 pve-fw kernel:  report_slot_reset+0x23/0xa0
Jun 23 10:01:54 pve-fw kernel:  ? __pfx_report_slot_reset+0x10/0x10
Jun 23 10:01:54 pve-fw kernel:  __pci_walk_bus+0x71/0xe0
Jun 23 10:01:54 pve-fw kernel:  pci_walk_bus+0x10/0x20
Jun 23 10:01:54 pve-fw kernel:  pcie_do_recovery+0x20b/0x3d0
Jun 23 10:01:54 pve-fw kernel:  ? __pfx_aer_root_reset+0x10/0x10
Jun 23 10:01:54 pve-fw kernel:  aer_process_err_devices+0x17a/0x1c0
Jun 23 10:01:54 pve-fw kernel:  aer_isr+0x1b5/0x1e0
Jun 23 10:01:54 pve-fw kernel:  ? __pfx_irq_thread_fn+0x10/0x10
Jun 23 10:01:54 pve-fw kernel:  irq_thread_fn+0x21/0x70
Jun 23 10:01:54 pve-fw kernel:  irq_thread+0xf8/0x1c0
Jun 23 10:01:54 pve-fw kernel:  ? __pfx_irq_thread_dtor+0x10/0x10
Jun 23 10:01:54 pve-fw kernel:  ? __pfx_irq_thread+0x10/0x10
Jun 23 10:01:54 pve-fw kernel:  kthread+0xef/0x120
Jun 23 10:01:54 pve-fw kernel:  ? __pfx_kthread+0x10/0x10
Jun 23 10:01:54 pve-fw kernel:  ret_from_fork+0x44/0x70
Jun 23 10:01:54 pve-fw kernel:  ? __pfx_kthread+0x10/0x10
Jun 23 10:01:54 pve-fw kernel:  ret_from_fork_asm+0x1b/0x30
Jun 23 10:01:54 pve-fw kernel:  </TASK>
Jun 23 10:01:54 pve-fw kernel: INFO: task kworker/2:2:396 blocked for more than 122 seconds.
Jun 23 10:01:54 pve-fw kernel:       Tainted: P           O       6.8.4-2-pve #1
Jun 23 10:01:54 pve-fw kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jun 23 10:01:54 pve-fw kernel: task:kworker/2:2     state:D stack:0     pid:396   tgid:396   ppid:2      flags:0x00004000
Jun 23 10:01:54 pve-fw kernel: Workqueue: events linkwatch_event
Jun 23 10:01:54 pve-fw kernel: Call Trace:
Jun 23 10:01:54 pve-fw kernel:  <TASK>
Jun 23 10:01:54 pve-fw kernel:  __schedule+0x401/0x15e0
Jun 23 10:01:54 pve-fw kernel:  schedule+0x33/0x110
Jun 23 10:01:54 pve-fw kernel:  schedule_preempt_disabled+0x15/0x30
Jun 23 10:01:54 pve-fw kernel:  __mutex_lock.constprop.0+0x3f8/0x7a0
Jun 23 10:01:54 pve-fw kernel:  ? finish_task_switch.isra.0+0x8c/0x310
Jun 23 10:01:54 pve-fw kernel:  ? add_timer_on+0xf9/0x150
Jun 23 10:01:54 pve-fw kernel:  __mutex_lock_slowpath+0x13/0x20
Jun 23 10:01:54 pve-fw kernel:  mutex_lock+0x3c/0x50
Jun 23 10:01:54 pve-fw kernel:  rtnl_lock+0x15/0x20
Jun 23 10:01:54 pve-fw kernel:  linkwatch_event+0x12/0x40
Jun 23 10:01:54 pve-fw kernel:  process_one_work+0x16a/0x350
Jun 23 10:01:54 pve-fw kernel:  worker_thread+0x306/0x440
Jun 23 10:01:54 pve-fw kernel:  ? __pfx_worker_thread+0x10/0x10
Jun 23 10:01:54 pve-fw kernel:  kthread+0xef/0x120
Jun 23 10:01:54 pve-fw kernel:  ? __pfx_kthread+0x10/0x10
Jun 23 10:01:54 pve-fw kernel:  ret_from_fork+0x44/0x70
Jun 23 10:01:54 pve-fw kernel:  ? __pfx_kthread+0x10/0x10
Jun 23 10:01:54 pve-fw kernel:  ret_from_fork_asm+0x1b/0x30
Jun 23 10:01:54 pve-fw kernel:  </TASK>
Jun 23 10:01:54 pve-fw kernel: INFO: task pvestatd:1468 blocked for more than 122 seconds.
Jun 23 10:01:54 pve-fw kernel:       Tainted: P           O       6.8.4-2-pve #1
Jun 23 10:01:54 pve-fw kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jun 23 10:01:54 pve-fw kernel: task:pvestatd        state:D stack:0     pid:1468  tgid:1468  ppid:1      flags:0x00000002
Jun 23 10:01:54 pve-fw kernel: Call Trace:
Jun 23 10:01:54 pve-fw kernel:  <TASK>
Jun 23 10:01:54 pve-fw kernel:  __schedule+0x401/0x15e0
Jun 23 10:01:54 pve-fw kernel:  schedule+0x33/0x110
Jun 23 10:01:54 pve-fw kernel:  schedule_preempt_disabled+0x15/0x30
Jun 23 10:01:54 pve-fw kernel:  __mutex_lock.constprop.0+0x3f8/0x7a0
Jun 23 10:01:54 pve-fw kernel:  ? filemap_get_read_batch+0x149/0x280
Jun 23 10:01:54 pve-fw kernel:  __mutex_lock_slowpath+0x13/0x20
Jun 23 10:01:54 pve-fw kernel:  mutex_lock+0x3c/0x50
Jun 23 10:01:54 pve-fw kernel:  __netlink_dump_start+0x76/0x2a0
Jun 23 10:01:54 pve-fw kernel:  ? __pfx_rtnl_dump_all+0x10/0x10
Jun 23 10:01:54 pve-fw kernel:  rtnetlink_rcv_msg+0x280/0x3c0
Jun 23 10:01:54 pve-fw kernel:  ? __pfx_rtnl_dump_all+0x10/0x10
Jun 23 10:01:54 pve-fw kernel:  ? __pfx_rtnetlink_rcv_msg+0x10/0x10
Jun 23 10:01:54 pve-fw kernel:  netlink_rcv_skb+0x5a/0x110
Jun 23 10:01:54 pve-fw kernel:  rtnetlink_rcv+0x15/0x30
Jun 23 10:01:54 pve-fw kernel:  netlink_unicast+0x1b0/0x2a0
Jun 23 10:01:54 pve-fw kernel:  netlink_sendmsg+0x214/0x470
Jun 23 10:01:54 pve-fw kernel:  __sys_sendto+0x21b/0x230
Jun 23 10:01:54 pve-fw kernel:  __x64_sys_sendto+0x24/0x40
Jun 23 10:01:54 pve-fw kernel:  do_syscall_64+0x84/0x180
Jun 23 10:01:54 pve-fw kernel:  ? do_syscall_64+0x93/0x180
Jun 23 10:01:54 pve-fw kernel:  ? irqentry_exit+0x43/0x50
Jun 23 10:01:54 pve-fw kernel:  ? exc_page_fault+0x94/0x1b0
Jun 23 10:01:54 pve-fw kernel:  entry_SYSCALL_64_after_hwframe+0x73/0x7b
Jun 23 10:01:54 pve-fw kernel: RIP: 0033:0x78eaa9b35b93
Jun 23 10:01:54 pve-fw kernel: RSP: 002b:00007ffffaa0dfe8 EFLAGS: 00000202 ORIG_RAX: 000000000000002c
Jun 23 10:01:54 pve-fw kernel: RAX: ffffffffffffffda RBX: 00007ffffaa0f0d0 RCX: 000078eaa9b35b93
Jun 23 10:01:54 pve-fw kernel: RDX: 0000000000000014 RSI: 00007ffffaa0f0d0 RDI: 0000000000000008
Jun 23 10:01:54 pve-fw kernel: RBP: 00007ffffaa0f120 R08: 00007ffffaa0f074 R09: 000000000000000c
Jun 23 10:01:54 pve-fw kernel: R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000008
Jun 23 10:01:54 pve-fw kernel: R13: 00007ffffaa0f1c8 R14: 00007ffffaa0f1d0 R15: 00005b11b26142a0
Jun 23 10:01:54 pve-fw kernel:  </TASK>

File: /etc/default/grub
Code:
root@pve-fw:~# cat /etc/default/grub
GRUB_DEFAULT=0
GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR=`lsb_release -i -s 2> /dev/null || echo Debian`
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on iommu=pt pcie_port_pm=off pcie_acs_override=downstream,multifunction"
GRUB_CMDLINE_LINUX=""

I then changed the kernel command line:
From: GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on iommu=pt pcie_port_pm=off pcie_acs_override=downstream,multifunction"
To: GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on"
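
A change to /etc/default/grub only applies after running update-grub and rebooting, so it is worth confirming what the running kernel actually booted with. A minimal sketch, assuming the host boots through GRUB; cmdline_has is a made-up helper name, not a Proxmox tool.

```shell
# Return 0 if PARAM appears as a whole word on the kernel command line.
# Usage: cmdline_has PARAM [FILE]   (FILE defaults to /proc/cmdline)
cmdline_has() {
    # Put each parameter on its own line, then look for an exact match.
    tr -s ' ' '\n' < "${2:-/proc/cmdline}" | grep -qxF -- "$1"
}

# Example on the host:
#   cmdline_has intel_iommu=on && echo "intel_iommu=on is active"
#   cmdline_has pcie_acs_override=downstream,multifunction || echo "override is gone"
```

Matching whole words avoids a false positive where, for example, "iommu=pt" would be found inside some longer parameter.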

Code:
Jun 23 11:11:00 pve-fw pvedaemon[2107]: start VM 103: UPID:pve-fw:0000083B:00002772:6677E6A4:qmstart:103:root@pam:
Jun 23 11:11:00 pve-fw pvedaemon[1480]: <root@pam> starting task UPID:pve-fw:0000083B:00002772:6677E6A4:qmstart:103:root@pam:
Jun 23 11:11:00 pve-fw kernel: igc 0000:05:00.0 enp5s0: PHC removed
Jun 23 11:11:00 pve-fw kernel: pcieport 0000:00:01.0: AER: Multiple Correctable error message received from 0000:02:03.0
Jun 23 11:11:00 pve-fw kernel: pcieport 0000:00:01.0: PCIe Bus Error: severity=Correctable, type=Transaction Layer, (Receiver ID)
Jun 23 11:11:00 pve-fw kernel: pcieport 0000:00:01.0:   device [8086:460d] error status/mask=0000a000/00002000
Jun 23 11:11:00 pve-fw kernel: pcieport 0000:00:01.0:    [15] HeaderOF
Jun 23 11:11:00 pve-fw kernel: pcieport 0000:02:03.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Transmitter ID)
Jun 23 11:11:00 pve-fw kernel: pcieport 0000:02:03.0:   device [12d8:2608] error status/mask=00001141/00002000
Jun 23 11:11:00 pve-fw kernel: pcieport 0000:02:03.0:    [ 0] RxErr                  (First)
Jun 23 11:11:00 pve-fw kernel: pcieport 0000:02:03.0:    [ 6] BadTLP
Jun 23 11:11:00 pve-fw kernel: pcieport 0000:02:03.0:    [ 8] Rollover
Jun 23 11:11:00 pve-fw kernel: pcieport 0000:02:03.0:    [12] Timeout
Jun 23 11:11:00 pve-fw kernel: pcieport 0000:02:03.0: AER:   Error of this Agent is reported first
Jun 23 11:11:00 pve-fw kernel: pcieport 0000:00:01.0: AER: Multiple Uncorrectable (Non-Fatal) error message received from 0000:00:01.0
Jun 23 11:11:00 pve-fw kernel: pcieport 0000:00:01.0: PCIe Bus Error: severity=Uncorrectable (Non-Fatal), type=Transaction Layer, (Requester ID)
Jun 23 11:11:00 pve-fw kernel: pcieport 0000:00:01.0:   device [8086:460d] error status/mask=00100000/00010000
Jun 23 11:11:00 pve-fw kernel: pcieport 0000:00:01.0:    [20] UnsupReq               (First)
Jun 23 11:11:00 pve-fw kernel: pcieport 0000:00:01.0: AER:   TLP Header: 30000000 02180032 00000000 00000000
Jun 23 11:11:00 pve-fw kernel: pcieport 0000:02:03.0: pciehp: Slot(0-2): Link Down
Jun 23 11:11:00 pve-fw kernel: pci 0000:05:00.0: Unable to change power state from unknown to D0, device inaccessible
Jun 23 11:11:00 pve-fw kernel: bond0: (slave enp3s0): link status definitely down, disabling slave
Jun 23 11:11:01 pve-fw kernel: pci 0000:05:00.0: timed out waiting for pending transaction; performing function level reset anyway
Jun 23 11:11:03 pve-fw kernel: pci 0000:05:00.0: not ready 1023ms after FLR; waiting
Jun 23 11:11:04 pve-fw kernel: pci 0000:05:00.0: not ready 2047ms after FLR; waiting
Jun 23 11:11:06 pve-fw kernel: pci 0000:05:00.0: not ready 4095ms after FLR; waiting
Jun 23 11:11:11 pve-fw kernel: pci 0000:05:00.0: not ready 8191ms after FLR; waiting
Jun 23 11:11:19 pve-fw kernel: pci 0000:05:00.0: not ready 16383ms after FLR; waiting
Jun 23 11:11:36 pve-fw kernel: pci 0000:05:00.0: not ready 32767ms after FLR; waiting
Jun 23 11:12:12 pve-fw kernel: pci 0000:05:00.0: not ready 65535ms after FLR; giving up
Jun 23 11:12:13 pve-fw kernel: pci 0000:05:00.0: not ready 1023ms after bus reset; giving up
Jun 23 11:12:15 pve-fw kernel: pci 0000:05:00.0: not ready 1023ms after bus reset; waiting
Jun 23 11:12:17 pve-fw kernel: pci 0000:05:00.0: not ready 2047ms after bus reset; waiting
Jun 23 11:12:19 pve-fw kernel: pci 0000:05:00.0: not ready 4095ms after bus reset; waiting
Jun 23 11:12:23 pve-fw kernel: pci 0000:05:00.0: not ready 8191ms after bus reset; waiting
Jun 23 11:12:32 pve-fw kernel: pci 0000:05:00.0: not ready 16383ms after bus reset; waiting
Jun 23 11:12:48 pve-fw kernel: pci 0000:05:00.0: not ready 32767ms after bus reset; waiting
Jun 23 11:13:04 pve-fw pvedaemon[1479]: <root@pam> successful auth for user 'root@pam'
Jun 23 11:13:22 pve-fw kernel: pci 0000:05:00.0: not ready 65535ms after bus reset; giving up
Jun 23 11:13:23 pve-fw kernel: pci 0000:05:00.0: not ready 1023ms after bus reset; giving up
Jun 23 11:13:25 pve-fw kernel: pci 0000:05:00.0: not ready 1023ms after bus reset; waiting
Jun 23 11:13:26 pve-fw kernel: INFO: task irq/122-aerdrv:91 blocked for more than 122 seconds.
Jun 23 11:13:26 pve-fw kernel:       Tainted: P           O       6.8.4-2-pve #1
Jun 23 11:13:26 pve-fw kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jun 23 11:13:26 pve-fw kernel: task:irq/122-aerdrv  state:D stack:0     pid:91    tgid:91    ppid:2      flags:0x00004000
Jun 23 11:13:26 pve-fw kernel: Call Trace:
Jun 23 11:13:26 pve-fw kernel:  <TASK>
Jun 23 11:13:26 pve-fw kernel:  __schedule+0x401/0x15e0
Jun 23 11:13:26 pve-fw kernel:  ? netlink_broadcast_filtered+0x17b/0x520
Jun 23 11:13:26 pve-fw kernel:  schedule+0x33/0x110
Jun 23 11:13:26 pve-fw kernel:  schedule_preempt_disabled+0x15/0x30
Jun 23 11:13:26 pve-fw kernel:  __mutex_lock.constprop.0+0x3f8/0x7a0
Jun 23 11:13:26 pve-fw kernel:  __mutex_lock_slowpath+0x13/0x20
Jun 23 11:13:26 pve-fw kernel:  mutex_lock+0x3c/0x50
Jun 23 11:13:26 pve-fw kernel:  report_error_detected+0x28/0x1c0
Jun 23 11:13:26 pve-fw kernel:  ? __pfx_report_normal_detected+0x10/0x10
Jun 23 11:13:26 pve-fw kernel:  report_normal_detected+0x16/0x30
Jun 23 11:13:26 pve-fw kernel:  __pci_walk_bus+0x71/0xe0
Jun 23 11:13:26 pve-fw kernel:  pci_walk_bus+0x10/0x20
Jun 23 11:13:26 pve-fw kernel:  pcie_do_recovery+0xd4/0x3d0
Jun 23 11:13:26 pve-fw kernel:  ? __pfx_aer_root_reset+0x10/0x10
Jun 23 11:13:26 pve-fw kernel:  aer_process_err_devices+0x17a/0x1c0
Jun 23 11:13:26 pve-fw kernel:  aer_isr+0x1b5/0x1e0
Jun 23 11:13:26 pve-fw kernel:  ? __pfx_irq_thread_fn+0x10/0x10
Jun 23 11:13:26 pve-fw kernel:  irq_thread_fn+0x21/0x70
Jun 23 11:13:26 pve-fw kernel:  irq_thread+0xf8/0x1c0
Jun 23 11:13:26 pve-fw kernel:  ? __pfx_irq_thread_dtor+0x10/0x10
Jun 23 11:13:26 pve-fw kernel:  ? __pfx_irq_thread+0x10/0x10
Jun 23 11:13:26 pve-fw kernel:  kthread+0xef/0x120
Jun 23 11:13:26 pve-fw kernel:  ? __pfx_kthread+0x10/0x10
Jun 23 11:13:26 pve-fw kernel:  ret_from_fork+0x44/0x70
Jun 23 11:13:26 pve-fw kernel:  ? __pfx_kthread+0x10/0x10
Jun 23 11:13:26 pve-fw kernel:  ret_from_fork_asm+0x1b/0x30
Jun 23 11:13:26 pve-fw kernel:  </TASK>
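
The "error status/mask" pairs in the AER lines can be decoded by hand. In both logs above, the bits listed under each line are exactly those that are set in the status word and clear in the mask. A small sketch of that arithmetic in POSIX shell; the function name aer_unmasked_bits is made up, and the bit names in the comments are the ones printed in the logs.

```shell
# Print the bit numbers that are set in STATUS but not in MASK.
# Both arguments are hex strings without a 0x prefix, as shown in the log.
aer_unmasked_bits() {
    status=$((0x$1))
    mask=$((0x$2))
    active=$(( status & ~mask ))   # drop every bit that the mask hides
    bit=0
    out=""
    while [ "$bit" -lt 32 ]; do
        if [ $(( (active >> bit) & 1 )) -eq 1 ]; then
            out="$out${out:+ }$bit"   # space-separated list of bit numbers
        fi
        bit=$((bit + 1))
    done
    printf '%s\n' "$out"
}

# aer_unmasked_bits 00001141 00002000   -> 0 6 8 12  (RxErr, BadTLP, Rollover, Timeout)
# aer_unmasked_bits 0000a000 00002000   -> 15        (HeaderOF; bit 13 is masked)
# aer_unmasked_bits 00100000 00010000   -> 20        (UnsupReq)
```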

Side note:
When I start the OPNsense VM with the needed NICs attached, my whole Proxmox installation stops working and I need to reboot the whole server.
 
I was able to address a similar issue with a 2-port I225-V PCIe NIC by adding the following options to GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub: pcie_aspm=off pci=noaer
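
As a sketch of what that edit could look like, assuming the host boots through GRUB: pcie_aspm=off disables PCIe Active State Power Management and pci=noaer turns off the kernel's Advanced Error Reporting handling. The helper name grub_append_opts is made up for illustration, and it expects options without "|", "&" or backslashes because they are pasted into a sed expression.

```shell
# Append OPTIONS to the GRUB_CMDLINE_LINUX_DEFAULT line of FILE.
# The previous version of FILE is kept as FILE.bak.
# Usage: grub_append_opts "pcie_aspm=off pci=noaer" [FILE]
grub_append_opts() {
    opts="$1"
    file="${2:-/etc/default/grub}"
    cp "$file" "$file.bak" &&
    # Insert the options just before the closing quote of the existing value.
    sed "s|^\(GRUB_CMDLINE_LINUX_DEFAULT=\"[^\"]*\)\"|\1 $opts\"|" "$file.bak" > "$file"
}

# Afterwards, on the host: run update-grub, reboot, then check /proc/cmdline.
```

Starting from the shortened line in the first post, this would give GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on pcie_aspm=off pci=noaer".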
 