PVE Kernel 5.3.18-3 PCI passthrough error - unable to read tail (got 0 bytes)

frotella

New Member
May 8, 2020
Hello, I'm experiencing a problem with PCIe passthrough (Intel NIC) on the latest Proxmox 6.1 with kernel 5.3.18-3 (EFI boot).
On 5.3.18-2 everything works as expected (FreeBSD/Linux/Windows VMs), but on 5.3.18-3, as soon as I boot a VM with passthrough, I get 'unable to read tail (got 0 bytes)'.

Bonus hint request: in the meantime, how do I edit /etc/kernel/cmdline to stick to 5.3.18-2?

Here's a trace
Code:
May 12 06:12:30 pv0-it kernel: [  125.937690] invalid opcode: 0000 [#1] SMP PTI
May 12 06:12:30 pv0-it kernel: [  125.937700] CPU: 2 PID: 3618 Comm: task UPID:pv0-i Tainted: P           O      5.3.18-3-pve #1
May 12 06:12:30 pv0-it kernel: [  125.937717] Hardware name: IBM IBM xSeries High Volume Towers x3100 M4  -[2582K1G]-/00D8867, BIOS -[JQE164AUS-1.07]- 12/09/2013
May 12 06:12:30 pv0-it kernel: [  125.937742] RIP: 0010:free_msi_irqs+0x17b/0x1b0
May 12 06:12:30 pv0-it kernel: [  125.937752] Code: 84 e1 fe ff ff 45 31 f6 eb 11 41 83 c6 01 44 39 73 14 0f 86 ce fe ff ff 8b 7b 10 44 01 f7 e8 6c 1f b8 ff 48 83 78 70 00 74 e0 <0f> 0b 49 8d b5 b0 00 00 00 e8 07 da b8 ff e9 cf fe ff ff 48 8b 78
May 12 06:12:30 pv0-it kernel: [  125.937787] RSP: 0018:ffffb6e915b5bcf8 EFLAGS: 00010286
May 12 06:12:30 pv0-it kernel: [  125.937798] RAX: ffff937df98d8400 RBX: ffff937e0a765d80 RCX: 0000000000000000
May 12 06:12:30 pv0-it kernel: [  125.937812] RDX: 0000000000000000 RSI: 0000000000000024 RDI: ffffffffa5466940
May 12 06:12:30 pv0-it kernel: [  125.937826] RBP: ffffb6e915b5bd28 R08: ffff937e1c001ff0 R09: ffff937e1c002138
May 12 06:12:30 pv0-it kernel: [  125.937840] R10: 0000000000000000 R11: ffffffffa5466948 R12: ffff937e1b68f2c0
May 12 06:12:30 pv0-it kernel: [  125.937854] R13: ffff937e1b68f000 R14: 0000000000000000 R15: fffffffffffffff2
May 12 06:12:30 pv0-it kernel: [  125.937869] FS:  00007ff8dadc21c0(0000) GS:ffff937e1fa80000(0000) knlGS:0000000000000000
May 12 06:12:30 pv0-it kernel: [  125.937884] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
May 12 06:12:30 pv0-it kernel: [  125.937896] CR2: 000055fcc8db0d3c CR3: 000000081a1e4005 CR4: 00000000001606e0
May 12 06:12:30 pv0-it kernel: [  125.937910] Call Trace:
May 12 06:12:30 pv0-it kernel: [  125.937920]  pci_disable_msi+0xfa/0x120
May 12 06:12:30 pv0-it kernel: [  125.937935]  e1000e_reset_interrupt_capability+0x52/0x60 [e1000e]
May 12 06:12:30 pv0-it kernel: [  125.937951]  e1000_remove+0xb9/0x170 [e1000e]
May 12 06:12:30 pv0-it kernel: [  125.937962]  pci_device_remove+0x3e/0xc0
May 12 06:12:30 pv0-it kernel: [  125.937971]  device_release_driver_internal+0xe0/0x1b0
May 12 06:12:30 pv0-it kernel: [  125.937983]  device_driver_detach+0x14/0x20
May 12 06:12:30 pv0-it kernel: [  125.937993]  unbind_store+0xf9/0x130
May 12 06:12:30 pv0-it kernel: [  125.938001]  drv_attr_store+0x27/0x40
May 12 06:12:30 pv0-it kernel: [  125.938011]  sysfs_kf_write+0x3b/0x40
May 12 06:12:30 pv0-it kernel: [  125.938019]  kernfs_fop_write+0xda/0x1c0
May 12 06:12:30 pv0-it kernel: [  125.938029]  __vfs_write+0x1b/0x40
May 12 06:12:30 pv0-it kernel: [  125.938037]  vfs_write+0xab/0x1b0
May 12 06:12:30 pv0-it kernel: [  125.938045]  ksys_write+0x61/0xe0
May 12 06:12:30 pv0-it kernel: [  125.938052]  __x64_sys_write+0x1a/0x20
May 12 06:12:30 pv0-it kernel: [  125.938062]  do_syscall_64+0x5a/0x130
May 12 06:12:30 pv0-it kernel: [  125.938072]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
May 12 06:12:30 pv0-it kernel: [  125.938083] RIP: 0033:0x7ff8dafcf471
May 12 06:12:30 pv0-it kernel: [  125.938092] Code: 00 00 75 05 48 83 c4 58 c3 e8 0b 4d ff ff 66 2e 0f 1f 84 00 00 00 00 00 90 8b 05 da ef 00 00 85 c0 75 16 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 57 c3 66 0f 1f 44 00 00 41 54 49 89 d4 55 48
May 12 06:12:30 pv0-it kernel: [  125.938127] RSP: 002b:00007fff4f3f18a8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
May 12 06:12:30 pv0-it kernel: [  125.938142] RAX: ffffffffffffffda RBX: 000055e4a69eb260 RCX: 00007ff8dafcf471
May 12 06:12:30 pv0-it kernel: [  125.938156] RDX: 000000000000000c RSI: 000055e4ad84ced0 RDI: 000000000000000d
May 12 06:12:30 pv0-it kernel: [  125.938170] RBP: 000055e4ad84ced0 R08: 0000000000000000 R09: aaaaaaaaaaaaaaab
May 12 06:12:30 pv0-it kernel: [  125.938184] R10: 000055e4ad842458 R11: 0000000000000246 R12: 000000000000000c
May 12 06:12:30 pv0-it kernel: [  125.938198] R13: 000055e4a69eb260 R14: 000000000000000d R15: 000055e4ad84a980
May 12 06:12:30 pv0-it kernel: [  125.938212] Modules linked in: nfsv3 nfs_acl rpcsec_gss_krb5 auth_rpcgss nfsv4 nfs lockd grace fscache ebtable_filter ebtables ip_set ip6table_raw iptable_raw ip6table_filter ip6_tables iptable_filter bpfilter bonding softdog nfnetlink_log nfnetlink intel_rapl_msr intel_rapl_common x86_pkg_temp_thermal intel_powerclamp coretemp mgag200 drm_vram_helper ttm kvm_intel drm_kms_helper kvm drm i2c_algo_bit fb_sys_fops syscopyarea ipmi_ssif sysfillrect crct10dif_pclmul crc32_pclmul ghash_clmulni_intel sysimgblt cdc_ether aesni_intel usbnet input_leds joydev mii aes_x86_64 crypto_simd cryptd ie31200_edac glue_helper ipmi_si ipmi_devintf mac_hid pcspkr intel_cstate ipmi_msghandler intel_rapl_perf sch_fq vhost_net vhost tap ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi vfio_pci sunrpc vfio_virqfd irqbypass vfio_iommu_type1 vfio ip_tables x_tables autofs4 zfs(PO) zunicode(PO) zlua(PO) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) btrfs xor zstd_compress
May 12 06:12:30 pv0-it kernel: [  125.938237]  raid6_pq libcrc32c wmi hid_sunplus hid_generic usbkbd usbmouse usbhid gpio_ich ahci i2c_i801 hid libahci lpc_ich e1000e
May 12 06:12:30 pv0-it kernel: [  125.941666] ---[ end trace 6d6d6578e1c43408 ]---
May 12 06:12:30 pv0-it kernel: [  125.942380] RIP: 0010:free_msi_irqs+0x17b/0x1b0
May 12 06:12:30 pv0-it kernel: [  125.943070] Code: 84 e1 fe ff ff 45 31 f6 eb 11 41 83 c6 01 44 39 73 14 0f 86 ce fe ff ff 8b 7b 10 44 01 f7 e8 6c 1f b8 ff 48 83 78 70 00 74 e0 <0f> 0b 49 8d b5 b0 00 00 00 e8 07 da b8 ff e9 cf fe ff ff 48 8b 78
May 12 06:12:30 pv0-it kernel: [  125.944529] RSP: 0018:ffffb6e915b5bcf8 EFLAGS: 00010286
May 12 06:12:30 pv0-it kernel: [  125.945261] RAX: ffff937df98d8400 RBX: ffff937e0a765d80 RCX: 0000000000000000
May 12 06:12:30 pv0-it kernel: [  125.946018] RDX: 0000000000000000 RSI: 0000000000000024 RDI: ffffffffa5466940
May 12 06:12:30 pv0-it kernel: [  125.946779] RBP: ffffb6e915b5bd28 R08: ffff937e1c001ff0 R09: ffff937e1c002138
May 12 06:12:30 pv0-it kernel: [  125.947509] R10: 0000000000000000 R11: ffffffffa5466948 R12: ffff937e1b68f2c0
May 12 06:12:30 pv0-it kernel: [  125.948240] R13: ffff937e1b68f000 R14: 0000000000000000 R15: fffffffffffffff2
May 12 06:12:30 pv0-it kernel: [  125.948960] FS:  00007ff8dadc21c0(0000) GS:ffff937e1fa80000(0000) knlGS:0000000000000000
May 12 06:12:30 pv0-it kernel: [  125.949698] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
May 12 06:12:30 pv0-it kernel: [  125.950415] CR2: 000055fcc8db0d3c CR3: 000000081a1e4005 CR4: 00000000001606e0
May 12 06:12:30 pv0-it pvedaemon[3168]: <root@pam> end task UPID:pv0-it:00000E22:00003117:5EBA3E4E:qmstart:100:root@pam: unable to read tail (got 0 bytes)
 
Try updating to 6.2 which ships with a 5.4-based kernel. I remember someone having a similar issue and that kernel fixing it, so worth a try.

To edit the kernel command line, put your changes in /etc/kernel/cmdline, as you say, or in /etc/default/grub if you use GRUB as your bootloader. Don't forget to run pve-efiboot-tool refresh or update-grub afterwards (again, depending on which bootloader you use). Check /proc/cmdline after a reboot to see if it worked.
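
A minimal sketch of the verification step, assuming a POSIX shell on the host. The helper name has_cmdline_param and the parameter intel_iommu=on are only illustrative and do not come from this thread; substitute whatever you actually added to the command line.

```shell
#!/bin/sh
# Check whether a kernel parameter is present in a command-line string.
# Usage: has_cmdline_param PARAM [CMDLINE]
# CMDLINE defaults to the running kernel's /proc/cmdline.
# has_cmdline_param is a hypothetical helper name, not a Proxmox tool.
has_cmdline_param() {
    param="$1"
    cmdline="${2:-$(cat /proc/cmdline)}"
    # Parameters are space-separated; compare whole words only.
    for word in $cmdline; do
        [ "$word" = "$param" ] && return 0
    done
    return 1
}

# Typical sequence (run as root on the Proxmox host; left as comments
# because the refresh command depends on the bootloader in use):
#   1. edit /etc/kernel/cmdline (systemd-boot) or /etc/default/grub (GRUB)
#   2. pve-efiboot-tool refresh    # or: update-grub
#   3. reboot, then verify, e.g. with the example parameter intel_iommu=on:
#        has_cmdline_param intel_iommu=on && echo "parameter active"
```

Matching whole words avoids a false positive when one parameter is a prefix of another.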
 
Just updated (two days ago this update wasn't available) and it seems that has solved the problem.
Solved for now; I'll keep testing. Thank you!
 
Hi, I am having the same problem when I try to start a normal VM. After issues with the network in my rack, the VM won't boot anymore and gives this error message.
Rebooting my nodes does not fix it.
 
Please open a new thread for such issues instead of replying to an old one. We're currently shipping a 5.11 kernel, making the previous response way outdated anyway, so I doubt you're experiencing the *same* issue. When opening a new thread, please provide more details, such as your network configuration ('/etc/network/interfaces'), hardware config ('ip link', 'ip a', etc.) and VM config ('qm config <vmid>').
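
The details requested above could be gathered into a single file with a small script such as the following sketch. The function name collect_report, the VM ID 100 and the output path are placeholders chosen for illustration; each command is skipped with a note if it is not available on the machine.

```shell
#!/bin/sh
# Collect network and VM configuration into one text file for a forum post.
# Usage: collect_report VMID OUTFILE
# collect_report is a hypothetical helper, not part of Proxmox VE.
collect_report() {
    vmid="$1"
    out="$2"
    {
        echo "== /etc/network/interfaces =="
        if [ -r /etc/network/interfaces ]; then
            cat /etc/network/interfaces
        else
            echo "(not readable)"
        fi

        echo "== ip link =="
        if command -v ip >/dev/null 2>&1; then ip link; else echo "(ip not found)"; fi

        echo "== ip a =="
        if command -v ip >/dev/null 2>&1; then ip a; else echo "(ip not found)"; fi

        echo "== qm config $vmid =="
        if command -v qm >/dev/null 2>&1; then qm config "$vmid"; else echo "(qm not found)"; fi
    } > "$out" 2>&1
}

# Example call (placeholder VM ID and path):
#   collect_report 100 /tmp/pve-report.txt
```

Review the resulting file before posting, since it can contain addresses you may not want to publish.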
 
