Kernel 5.15.64 breaks my VMs

karypid

Hello all,

I updated from 5.15.53-1-pve to 5.15.64-1-pve and my VMs have stopped working. When I start them, the UI simply prints the error "Error: unable to read tail (got 0 bytes)". In /var/log/daemon.log I see:

Code:
Nov 14 21:30:25 pve pve-guests[7004]: start VM 101: UPID:pve:00001B5C:00000C83:6372B370:qmstart:101:root@pam:
Nov 14 21:30:25 pve pve-guests[3422]: <root@pam> starting task UPID:pve:00001B5C:00000C83:6372B370:qmstart:101:root@pam:
Nov 14 21:30:27 pve pvesh[3421]: Starting VM 101 failed: unable to read tail (got 0 bytes)
Nov 14 21:30:27 pve pve-guests[3421]: <root@pam> end task UPID:pve:00000D5E:0000081E:6372B365:startall::root@pam: OK

I have pinned kernel 5.15.53-1-pve again, which works fine. It looks like the newer version of the AMD graphics driver may have a bug.
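For anyone who wants to fall back the same way: the post does not say how the kernel was pinned, so the following is only a sketch, assuming a Proxmox VE release whose `proxmox-boot-tool` provides the `kernel pin` subcommand. The wrapper name `pin_kernel` and the `BOOT_TOOL` variable are hypothetical and exist only so the call can be previewed first.

```shell
# Hypothetical wrapper around the kernel-pinning subcommand of
# proxmox-boot-tool. BOOT_TOOL is an assumption of this sketch: it lets the
# call be previewed (BOOT_TOOL=echo) before it changes the boot configuration.
pin_kernel() {
    "${BOOT_TOOL:-proxmox-boot-tool}" kernel pin "$1"
}
```

Running `BOOT_TOOL=echo pin_kernel 5.15.53-1-pve` only prints the arguments, while `pin_kernel 5.15.53-1-pve` as root would perform the pin. `proxmox-boot-tool kernel list` shows which versions are installed and selectable.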

Anyway, I ran a diff between journalctl -b and journalctl -b -1 to see differences between the messages logged at startup and I found:

  • The newer kernel prints one extra message for Spectre mitigation which does not exist in the older kernel: pve kernel: Spectre V2 : Spectre v2 / SpectreRSB : Filling RSB on VMEXIT
  • The newer kernel logs: Nov 14 21:29:56 pve kernel: event_source amd_iommu_0: hash matches which is not logged in the older kernel
  • When the VM starts (it is set to start on boot), the amdgpu messages diverge between the two kernels, and the newer kernel ends in an error (note that the VM has PCI passthrough enabled).
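A note for anyone repeating this comparison: raw journal output carries timestamps and PIDs that change on every boot, so a plain diff flags almost every line. A minimal sketch of one way around that is to strip those fields first. The helper name `normalize_journal` is hypothetical, and the sketch assumes the default short output format seen in this thread (`Mon DD HH:MM:SS host unit[pid]: message`).

```shell
# Hypothetical helper: strip the per-boot noise from journalctl's default
# short output so that two boots can be compared line by line.
normalize_journal() {
    # 1st expression: drop the leading "Mon DD HH:MM:SS " timestamp.
    # 2nd expression: drop the first "[pid]" that directly precedes a colon.
    sed -E -e 's/^[A-Z][a-z]{2} +[0-9]+ [0-9]{2}:[0-9]{2}:[0-9]{2} //' \
           -e 's/\[[0-9]+\]:/:/'
}
```

Something like `journalctl -b -1 | normalize_journal > prev.txt`, then `journalctl -b | normalize_journal > cur.txt`, then `diff prev.txt cur.txt` should narrow the output to messages that actually changed. Task identifiers (the `UPID:...` strings) still differ between boots and are not handled here.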
The third difference, the amdgpu behaviour at VM start, is probably the most interesting. Initially, when starting the VM, both kernels log:

Code:
Nov 14 09:49:28 pve pvesh[3338]: Starting VM 101
Nov 14 09:49:28 pve pve-guests[3339]: <root@pam> starting task UPID:pve:00001B0E:00000C6D:63720F28:qmstart:101:root@pam:
Nov 14 09:49:28 pve pve-guests[6926]: start VM 101: UPID:pve:00001B0E:00000C6D:63720F28:qmstart:101:root@pam:
Nov 14 09:49:29 pve kernel: [drm] PCIE GART of 512M enabled (table at 0x0000008000000000).
Nov 14 09:49:29 pve kernel: [drm] PSP is resuming...
Nov 14 09:49:29 pve kernel: [drm] reserve 0xa00000 from 0x82fe000000 for PSP TMR
Nov 14 09:49:29 pve kernel: amdgpu 0000:0e:00.0: amdgpu: RAS: optional ras ta ucode is not available
Nov 14 09:49:29 pve kernel: amdgpu 0000:0e:00.0: amdgpu: SECUREDISPLAY: securedisplay ta ucode is not available
Nov 14 09:49:29 pve kernel: amdgpu 0000:0e:00.0: amdgpu: SMU is resuming...
Nov 14 09:49:29 pve kernel: amdgpu 0000:0e:00.0: amdgpu: smu driver if version = 0x0000000e, smu fw if version = 0x00000012, smu fw version = 0x00413700 (65.55.0)
Nov 14 09:49:29 pve kernel: amdgpu 0000:0e:00.0: amdgpu: SMU driver if version not matched
Nov 14 09:49:29 pve kernel: amdgpu 0000:0e:00.0: amdgpu: SMU is resumed successfully!
Nov 14 09:49:29 pve kernel: [drm] DMUB hardware initialized: version=0x02020013
Nov 14 09:49:30 pve kernel: [drm:retrieve_link_cap [amdgpu]] *ERROR* retrieve_link_cap: Read receiver caps dpcd data failed.
Nov 14 09:49:30 pve kernel: [drm] kiq ring mec 2 pipe 1 q 0
Nov 14 09:49:30 pve kernel: [drm] VCN decode and encode initialized successfully(under DPG Mode).
Nov 14 09:49:30 pve kernel: [drm] JPEG decode initialized successfully.
Nov 14 09:49:30 pve kernel: amdgpu 0000:0e:00.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
Nov 14 09:49:30 pve kernel: amdgpu 0000:0e:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
Nov 14 09:49:30 pve kernel: amdgpu 0000:0e:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
Nov 14 09:49:30 pve kernel: amdgpu 0000:0e:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 5 on hub 0
Nov 14 09:49:30 pve kernel: amdgpu 0000:0e:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 6 on hub 0
Nov 14 09:49:30 pve kernel: amdgpu 0000:0e:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 7 on hub 0
Nov 14 09:49:30 pve kernel: amdgpu 0000:0e:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 8 on hub 0
Nov 14 09:49:30 pve kernel: amdgpu 0000:0e:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 9 on hub 0
Nov 14 09:49:30 pve kernel: amdgpu 0000:0e:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 10 on hub 0
Nov 14 09:49:30 pve kernel: amdgpu 0000:0e:00.0: amdgpu: ring kiq_2.1.0 uses VM inv eng 11 on hub 0
Nov 14 09:49:30 pve kernel: amdgpu 0000:0e:00.0: amdgpu: ring sdma0 uses VM inv eng 12 on hub 0
Nov 14 09:49:30 pve kernel: amdgpu 0000:0e:00.0: amdgpu: ring sdma1 uses VM inv eng 13 on hub 0
Nov 14 09:49:30 pve kernel: amdgpu 0000:0e:00.0: amdgpu: ring vcn_dec_0 uses VM inv eng 0 on hub 1
Nov 14 09:49:30 pve kernel: amdgpu 0000:0e:00.0: amdgpu: ring vcn_enc_0.0 uses VM inv eng 1 on hub 1
Nov 14 09:49:30 pve kernel: amdgpu 0000:0e:00.0: amdgpu: ring vcn_enc_0.1 uses VM inv eng 4 on hub 1
Nov 14 09:49:30 pve kernel: amdgpu 0000:0e:00.0: amdgpu: ring jpeg_dec uses VM inv eng 5 on hub 1
Nov 14 09:49:30 pve kernel: amdgpu 0000:0e:00.0: [drm] Cannot find any crtc or sizes
Nov 14 09:49:30 pve kernel: amdgpu 0000:0e:00.0: amdgpu: amdgpu: finishing device.
Nov 14 09:49:30 pve kernel: amdgpu 0000:0e:00.0: amdgpu: Fail to disable thermal alert!

Up to this point, both kernels print the same messages when starting the VM. It seems to me the AMD driver is releasing the card for use in the VM. From here on, however, the two kernels diverge.

The 5.15.53 kernel (which works fine) prints:
Code:
Nov 14 09:49:30 pve kernel: [drm] free PSP TMR buffer
Nov 14 09:49:30 pve kernel: [drm] amdgpu: ttm finalized
Nov 14 09:49:30 pve kernel: vfio-pci 0000:0e:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=io+mem:owns=none
Nov 14 09:49:30 pve systemd[1]: Created slice qemu.slice.
Nov 14 09:49:30 pve systemd[1]: Started 101.scope.
...

The newer 5.15.64 kernel throws an error:
Code:
Nov 14 21:30:27 pve kernel: BUG: kernel NULL pointer dereference, address: 000000000000013c
Nov 14 21:30:27 pve kernel: #PF: supervisor read access in kernel mode
Nov 14 21:30:27 pve kernel: #PF: error_code(0x0000) - not-present page
Nov 14 21:30:27 pve kernel: PGD 0 P4D 0
Nov 14 21:30:27 pve kernel: Oops: 0000 [#1] SMP NOPTI
Nov 14 21:30:27 pve kernel: CPU: 8 PID: 7004 Comm: task UPID:pve:0 Tainted: P           O      5.15.64-1-pve #1
Nov 14 21:30:27 pve kernel: Hardware name: Gigabyte Technology Co., Ltd. X570S AERO G/X570S AERO G, BIOS F4c 05/12/2022
Nov 14 21:30:27 pve kernel: RIP: 0010:amdgpu_dm_fini+0x184/0x240 [amdgpu]
Nov 14 21:30:27 pve kernel: Code: 01 00 48 85 ff 74 10 e8 5a f2 26 00 48 c7 83 e0 4e 01 00 00 00 00 00 4c 8b 83 48 4f 01 00 4d 85 c0 74 66 48 8b 93 40 3e 01 00 <8b> 82 3c 01 00 0>
Nov 14 21:30:27 pve kernel: RSP: 0018:ffffab0b92bd7b58 EFLAGS: 00010282
Nov 14 21:30:27 pve kernel: RAX: 0000000000000000 RBX: ffff9cd247620000 RCX: 00000000810000f4
Nov 14 21:30:27 pve kernel: RDX: 0000000000000000 RSI: 00000000810000f4 RDI: 0000000000000000
Nov 14 21:30:27 pve kernel: RBP: ffffab0b92bd7b70 R08: ffff9cd245746d80 R09: 0000000000000001
Nov 14 21:30:27 pve kernel: R10: ffff9cd250b99600 R11: dead000000000122 R12: 0000000000000006
Nov 14 21:30:27 pve kernel: R13: ffff9cd247635640 R14: 0000000000000005 R15: 0000000000000001
Nov 14 21:30:27 pve kernel: FS:  00007f01da568280(0000) GS:ffff9ce13ec00000(0000) knlGS:0000000000000000
Nov 14 21:30:27 pve kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Nov 14 21:30:27 pve kernel: CR2: 000000000000013c CR3: 000000011e41e000 CR4: 0000000000750ee0
Nov 14 21:30:27 pve kernel: PKRU: 55555554
Nov 14 21:30:27 pve kernel: Call Trace:
Nov 14 21:30:27 pve kernel:  <TASK>
Nov 14 21:30:27 pve kernel:  dm_hw_fini+0x23/0x30 [amdgpu]
Nov 14 21:30:27 pve kernel:  amdgpu_device_fini_hw+0x3c7/0x50b [amdgpu]
Nov 14 21:30:27 pve kernel:  amdgpu_driver_unload_kms+0x69/0x90 [amdgpu]
Nov 14 21:30:27 pve kernel:  amdgpu_pci_remove+0x27/0x50 [amdgpu]
Nov 14 21:30:27 pve kernel:  pci_device_remove+0x3e/0xb0
Nov 14 21:30:27 pve kernel:  __device_release_driver+0x1ab/0x2a0
Nov 14 21:30:27 pve kernel:  device_driver_detach+0x56/0xe0
Nov 14 21:30:27 pve kernel:  unbind_store+0x12a/0x140
Nov 14 21:30:27 pve kernel:  drv_attr_store+0x24/0x40
Nov 14 21:30:27 pve kernel:  sysfs_kf_write+0x3f/0x50
Nov 14 21:30:27 pve kernel:  kernfs_fop_write_iter+0x13f/0x1d0
Nov 14 21:30:27 pve kernel:  new_sync_write+0x114/0x1b0
Nov 14 21:30:27 pve kernel:  vfs_write+0x1d9/0x270
Nov 14 21:30:27 pve kernel:  ksys_write+0x67/0xf0
Nov 14 21:30:27 pve kernel:  __x64_sys_write+0x1a/0x20
Nov 14 21:30:27 pve kernel:  do_syscall_64+0x5c/0xc0
Nov 14 21:30:27 pve kernel:  ? syscall_exit_to_user_mode+0x27/0x50
Nov 14 21:30:27 pve kernel:  ? __x64_sys_newfstat+0x16/0x20
Nov 14 21:30:27 pve kernel:  ? do_syscall_64+0x69/0xc0
Nov 14 21:30:27 pve kernel:  ? generic_file_llseek+0x24/0x30
Nov 14 21:30:27 pve kernel:  ? exit_to_user_mode_prepare+0x37/0x1b0
Nov 14 21:30:27 pve kernel:  ? syscall_exit_to_user_mode+0x27/0x50
Nov 14 21:30:27 pve kernel:  ? __x64_sys_lseek+0x1a/0x20
Nov 14 21:30:27 pve kernel:  ? do_syscall_64+0x69/0xc0
Nov 14 21:30:27 pve kernel:  ? syscall_exit_to_user_mode+0x27/0x50
Nov 14 21:30:27 pve kernel:  ? do_syscall_64+0x69/0xc0
Nov 14 21:30:27 pve kernel:  ? do_syscall_64+0x69/0xc0
Nov 14 21:30:27 pve kernel:  entry_SYSCALL_64_after_hwframe+0x61/0xcb
Nov 14 21:30:27 pve kernel: RIP: 0033:0x7f01da78afb3
Nov 14 21:30:27 pve kernel: Code: 75 05 48 83 c4 58 c3 e8 cb 41 ff ff 66 2e 0f 1f 84 00 00 00 00 00 90 64 8b 04 25 18 00 00 00 85 c0 75 14 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff f>
Nov 14 21:30:27 pve kernel: RSP: 002b:00007ffcdc93c398 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
Nov 14 21:30:27 pve kernel: RAX: ffffffffffffffda RBX: 0000556068503d40 RCX: 00007f01da78afb3
Nov 14 21:30:27 pve kernel: RDX: 000000000000000c RSI: 0000556068503d40 RDI: 000000000000000e
Nov 14 21:30:27 pve kernel: RBP: 000000000000000c R08: 0000000000000000 R09: 00005560617b53b0
Nov 14 21:30:27 pve kernel: R10: 00005560684f4a98 R11: 0000000000000246 R12: 0000556068502270
Nov 14 21:30:27 pve kernel: R13: 0000556061c3c2a0 R14: 000000000000000e R15: 0000556068502270
Nov 14 21:30:27 pve kernel:  </TASK>
Nov 14 21:30:27 pve kernel: Modules linked in: xt_nat xt_tcpudp xt_conntrack xt_MASQUERADE xfrm_user xfrm_algo xt_addrtype iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defra>
Nov 14 21:30:27 pve kernel:  sunrpc ip_tables x_tables autofs4 zfs(PO) zunicode(PO) zzstd(O) zlua(O) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) btrfs blake2b_generic xor zst>
Nov 14 21:30:27 pve kernel: CR2: 000000000000013c
Nov 14 21:30:27 pve kernel: ---[ end trace 7e553325ae867a60 ]---
Nov 14 21:30:27 pve kernel: RIP: 0010:amdgpu_dm_fini+0x184/0x240 [amdgpu]
Nov 14 21:30:27 pve kernel: Code: 01 00 48 85 ff 74 10 e8 5a f2 26 00 48 c7 83 e0 4e 01 00 00 00 00 00 4c 8b 83 48 4f 01 00 4d 85 c0 74 66 48 8b 93 40 3e 01 00 <8b> 82 3c 01 00 0>
Nov 14 21:30:27 pve kernel: RSP: 0018:ffffab0b92bd7b58 EFLAGS: 00010282
Nov 14 21:30:27 pve kernel: RAX: 0000000000000000 RBX: ffff9cd247620000 RCX: 00000000810000f4
Nov 14 21:30:27 pve kernel: RDX: 0000000000000000 RSI: 00000000810000f4 RDI: 0000000000000000
Nov 14 21:30:27 pve kernel: RBP: ffffab0b92bd7b70 R08: ffff9cd245746d80 R09: 0000000000000001
Nov 14 21:30:27 pve kernel: R10: ffff9cd250b99600 R11: dead000000000122 R12: 0000000000000006
Nov 14 21:30:27 pve kernel: R13: ffff9cd247635640 R14: 0000000000000005 R15: 0000000000000001
Nov 14 21:30:27 pve kernel: FS:  00007f01da568280(0000) GS:ffff9ce13ec00000(0000) knlGS:0000000000000000
Nov 14 21:30:27 pve kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Nov 14 21:30:27 pve kernel: CR2: 000000000000013c CR3: 000000011e41e000 CR4: 0000000000750ee0
Nov 14 21:30:27 pve kernel: PKRU: 55555554
Nov 14 21:30:27 pve pvesh[3421]: Starting VM 101 failed: unable to read tail (got 0 bytes)
Nov 14 21:30:27 pve pve-guests[3421]: <root@pam> end task UPID:pve:00000D5E:0000081E:6372B365:startall::root@pam: OK
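Read from the bottom up, the call trace shows a `write` syscall reaching `unbind_store`, continuing through `amdgpu_pci_remove`, and faulting in `amdgpu_dm_fini`, which fits the idea that the crash happens while the host driver is being detached from the card. For anyone checking whether their own log shows the same fault, here is a minimal sketch that pulls the faulting kernel symbol out of a log. The helper name `oops_symbols` is hypothetical, and the sketch assumes x86-64, where kernel-mode faults are reported with code segment `0010`.

```shell
# Hypothetical helper: print the faulting kernel symbol(s) from a log on stdin.
# Kernel-mode faults on x86-64 appear as "RIP: 0010:<symbol>+<off>/<size>";
# user-mode addresses (e.g. "RIP: 0033:...") are deliberately skipped.
oops_symbols() {
    sed -n 's/.*RIP: 0010:\([^ ]*\).*/\1/p' | sort -u
}
```

For example, `journalctl -k -b | oops_symbols` prints each distinct faulting symbol from the current boot once, even though the oops repeats the `RIP` line after the end of the trace.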

I used to blacklist drm/amdgpu to make sure the host kernel does not touch the GPUs. Newer kernels seemed to have no issue with releasing the GPUs, so I removed the blacklisting options (to be able to see the kernel log messages on start). I wonder whether I should restore the options that disable any form of logging on the host via the GPU, as that might work around the issue (which seems to happen when the card is released).

I hope this helps developers... If anyone wants me to test anything and/or provide logs, please let me know. I'm happy to test as needed...
 
Hi,

I just confirmed the workaround I hypothesized in my post: adding the blacklist configuration, together with the vfio options and the kernel parameters to not log on boot, allows me to use 5.15.64.

I suspect that the AMD driver has a bug in its "release GPU" code that was introduced some time after 5.15.53.

EDIT: details on what I changed:

Code:
root@pve:~# cat /etc/modprobe.d/blacklist.conf
blacklist radeon
blacklist nouveau
blacklist nvidia
blacklist amdgpu


root@pve:~# cat /etc/modprobe.d/vfio.conf
# 6700XT: 1462:3982,1002:ab28
# 6800  : 1849:5203,1002:ab28
options vfio-pci ids=1462:3982,1849:5203,1002:ab28 disable_idle_d3=1 disable_vga=1
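To check that files like these have taken effect, it helps to look at which driver actually owns the card before the VM starts. Changes under /etc/modprobe.d usually need the initramfs rebuilt (for example with `update-initramfs -u -k all`) and a reboot before they apply to early-loaded modules. The following is a minimal sketch that reads the driver binding from sysfs. The helper name `pci_driver` and its optional second argument are hypothetical, and `0000:0e:00.0` is simply the address from the logs in this thread.

```shell
# Hypothetical helper: report which kernel driver currently owns a PCI device.
# $1 is the full PCI address (e.g. 0000:0e:00.0);
# $2 optionally overrides the sysfs mount point (default /sys).
pci_driver() {
    link="${2:-/sys}/bus/pci/devices/$1/driver"
    if [ -L "$link" ]; then
        # The "driver" entry is a symlink into /sys/bus/pci/drivers/<name>.
        basename "$(readlink "$link")"
    else
        # No symlink means no driver is bound to the device.
        echo "none"
    fi
}
```

With amdgpu blacklisted, `pci_driver 0000:0e:00.0` would be expected to print `vfio-pci` or `none` rather than `amdgpu`. If it still prints `amdgpu`, the host driver claimed the card and the unbind path from the oops is back in play.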
 
I'm very grateful you posted this, thank you.
I upgraded yesterday to the system's proposed kernel 5.15.83-1 and my VM with a passed-through AMD GPU would not start. I am on PVE 7.3-4.
I am unsure whether there were two upgrades, as my list of kernels shows 5.15.74-1 with a date of 14 November 2022. I rebooted into it and still have the same problem.
Next, I will try the previous kernel, 5.15.39-3 from July 2022, which I am almost certain is the one that has been fine until now. My GPU passthrough has worked "out of the box". I don't know how to configure my system otherwise.
 
