[SOLVED] pve-kernel-5.13.19-4 and sata hotplug issue

keeka

I have the Intel SATA ports configured as AHCI in the BIOS, with one port designated as hotplug. I use this port, connected to an HDD caddy, for hot-swapping drives as a (non-PVE storage) backup destination.

Despite some warnings due to the incomplete ACPI implementation of this board:
Code:
ACPI BIOS Error (bug): Could not resolve symbol [\_SB.PCI0.SAT0.SPT0._GTF.DSSP], AE_NOT_FOUND (20210331/psargs-330)
ACPI Error: Aborting method \_SB.PCI0.SAT0.SPT0._GTF due to previous error (AE_NOT_FOUND) (20210331/psparse-529)
this setup has worked very reliably for a couple of years, through several PVE versions and many kernel upgrades.

However, after upgrading to pve-kernel-5.13.19-4-pve, I see a kernel oops when trying to remove a drive after use.
My procedure has been: insert the drive, power on the caddy, mount the partition, run the backup jobs. I then issue this command to park the drive prior to powering off:

echo 1 >/sys/block/sdd/device/delete

This generates the oops below. It also occurs if I simply power down the drive without issuing that command first.
In either case, after the kernel error the drive remains visible in lsblk but nothing can be done with it until a reboot.
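
For completeness, the full cycle looks roughly like this (just a sketch of my routine; the device name, mount point and rsync job are placeholders):
Code:
# Hot-swap backup cycle (device name, mount point and job are placeholders)
mount /dev/sdd1 /mnt/hotswap              # mount the caddy drive
rsync -a /srv/backups/ /mnt/hotswap/      # run the backup job
umount /mnt/hotswap                       # unmount, flushing pending writes
echo 1 > /sys/block/sdd/device/delete     # park/remove the device before powering off the caddy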

If I boot with pve-kernel-5.13.19-3-pve, there are no such errors. I'm not sure where best to report this, or whether the shortcomings of the old board's ACPI implementation have finally caught up with me.

Code:
Feb 06 20:37:02 pve kernel: BUG: kernel NULL pointer dereference, address: 0000000000000030
Feb 06 20:37:02 pve kernel: #PF: supervisor read access in kernel mode
Feb 06 20:37:02 pve kernel: #PF: error_code(0x0000) - not-present page
Feb 06 20:37:02 pve kernel: PGD 0 P4D 0
Feb 06 20:37:02 pve kernel: Oops: 0000 [#1] SMP PTI
Feb 06 20:37:02 pve kernel: CPU: 5 PID: 1761 Comm: bash Tainted: P           O      5.13.19-4-pve #1
Feb 06 20:37:02 pve kernel: Hardware name: Gigabyte Technology Co., Ltd. To be filled by O.E.M./Z77X-UD5H, BIOS F16j 11/14/2017
Feb 06 20:37:02 pve kernel: RIP: 0010:blk_mq_cancel_work_sync+0x5/0x60
Feb 06 20:37:02 pve kernel: Code: 4c 89 f7 e8 9d 13 00 00 e9 4b ff ff ff 48 8b 45 d0 49 89 85 e8 00 00 00 31 c0 eb a0 b8 ea ff ff ff eb c1 66 90 0f 1f 44 00 00 <48> 8>
Feb 06 20:37:02 pve kernel: RSP: 0018:ffffba0b40e0fb78 EFLAGS: 00010246
Feb 06 20:37:02 pve kernel: RAX: 0000000000000000 RBX: ffff987546ba2100 RCX: 0000000000000286
Feb 06 20:37:02 pve kernel: RDX: ffff9875473f47c0 RSI: ffffffffba9a7766 RDI: 0000000000000000
Feb 06 20:37:02 pve kernel: RBP: ffffba0b40e0fb90 R08: 0000000000000286 R09: ffffba0b40e0fbc0
Feb 06 20:37:02 pve kernel: R10: 0000000000000000 R11: ffff98754f3af948 R12: ffff9875473f4520
Feb 06 20:37:02 pve kernel: R13: 0000000000000000 R14: 0000000000000000 R15: ffff987544717198
Feb 06 20:37:02 pve kernel: FS:  00007f7e8d133740(0000) GS:ffff987c3fa80000(0000) knlGS:0000000000000000
Feb 06 20:37:02 pve kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Feb 06 20:37:02 pve kernel: CR2: 0000000000000030 CR3: 0000000104f02002 CR4: 00000000001706e0
Feb 06 20:37:02 pve kernel: Call Trace:
Feb 06 20:37:02 pve kernel:  <TASK>
Feb 06 20:37:02 pve kernel:  ? disk_release+0x24/0xa0
Feb 06 20:37:02 pve kernel:  device_release+0x3b/0xa0
Feb 06 20:37:02 pve kernel:  kobject_put+0x94/0x1b0
Feb 06 20:37:02 pve kernel:  put_device+0x13/0x20
Feb 06 20:37:02 pve kernel:  put_disk+0x1b/0x20
Feb 06 20:37:02 pve kernel:  sg_device_destroy+0x54/0x90
Feb 06 20:37:02 pve kernel:  sg_remove_device+0x128/0x170
Feb 06 20:37:02 pve kernel:  device_del+0x138/0x3e0
Feb 06 20:37:02 pve kernel:  ? kobject_put+0xae/0x1b0
Feb 06 20:37:02 pve kernel:  device_unregister+0x1b/0x60
Feb 06 20:37:02 pve kernel:  __scsi_remove_device+0x110/0x150
Feb 06 20:37:02 pve kernel:  sdev_store_delete+0x6b/0xd0
Feb 06 20:37:02 pve kernel:  dev_attr_store+0x17/0x30
Feb 06 20:37:02 pve kernel:  sysfs_kf_write+0x3f/0x50
Feb 06 20:37:02 pve kernel:  kernfs_fop_write_iter+0x13b/0x1d0
Feb 06 20:37:02 pve kernel:  new_sync_write+0x114/0x1a0
Feb 06 20:37:02 pve kernel:  vfs_write+0x1c5/0x260
Feb 06 20:37:02 pve kernel:  ksys_write+0x67/0xe0
Feb 06 20:37:02 pve kernel:  __x64_sys_write+0x1a/0x20
Feb 06 20:37:02 pve kernel:  do_syscall_64+0x61/0xb0
Feb 06 20:37:02 pve kernel:  ? do_syscall_64+0x6e/0xb0
Feb 06 20:37:02 pve kernel:  ? syscall_exit_to_user_mode+0x27/0x50
Feb 06 20:37:02 pve kernel:  ? do_syscall_64+0x6e/0xb0
Feb 06 20:37:02 pve kernel:  ? irqentry_exit+0x19/0x30
Feb 06 20:37:02 pve kernel:  ? exc_page_fault+0x8f/0x170
Feb 06 20:37:02 pve kernel:  ? asm_exc_page_fault+0x8/0x30
Feb 06 20:37:02 pve kernel:  entry_SYSCALL_64_after_hwframe+0x44/0xae
Feb 06 20:37:02 pve kernel: RIP: 0033:0x7f7e8d224f33
Feb 06 20:37:02 pve kernel: Code: 8b 15 61 ef 0c 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 64 8b 04 25 18 00 00 00 85 c0 75 14 b8 01 00 00 00 0f 05 <48> 3>
Feb 06 20:37:02 pve kernel: RSP: 002b:00007ffed6888a78 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
Feb 06 20:37:02 pve kernel: RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007f7e8d224f33
Feb 06 20:37:02 pve kernel: RDX: 0000000000000002 RSI: 000055f3de56be50 RDI: 0000000000000001
Feb 06 20:37:02 pve kernel: RBP: 000055f3de56be50 R08: 000000000000000a R09: 0000000000000001
Feb 06 20:37:02 pve kernel: R10: 000055f3de58a640 R11: 0000000000000246 R12: 0000000000000002
Feb 06 20:37:02 pve kernel: R13: 00007f7e8d2f56a0 R14: 0000000000000002 R15: 00007f7e8d2f58a0
Feb 06 20:37:02 pve kernel:  </TASK>
Feb 06 20:37:02 pve kernel: Modules linked in: ebtable_filter ebtables ip_set ip6table_raw iptable_raw ip6table_filter ip6_tables iptable_filter bpfilter 8021q garp m>
Feb 06 20:37:02 pve kernel:  blake2b_generic xor zstd_compress raid6_pq usbkbd dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio libcrc32c hid_generic usbhid hid>
Feb 06 20:37:02 pve kernel: CR2: 0000000000000030
Feb 06 20:37:02 pve kernel: ---[ end trace 254dfa2bfb763187 ]---
Feb 06 20:37:02 pve kernel: RIP: 0010:blk_mq_cancel_work_sync+0x5/0x60
Feb 06 20:37:02 pve kernel: Code: 4c 89 f7 e8 9d 13 00 00 e9 4b ff ff ff 48 8b 45 d0 49 89 85 e8 00 00 00 31 c0 eb a0 b8 ea ff ff ff eb c1 66 90 0f 1f 44 00 00 <48> 8>
Feb 06 20:37:02 pve kernel: RSP: 0018:ffffba0b40e0fb78 EFLAGS: 00010246
Feb 06 20:37:02 pve kernel: RAX: 0000000000000000 RBX: ffff987546ba2100 RCX: 0000000000000286
Feb 06 20:37:02 pve kernel: RDX: ffff9875473f47c0 RSI: ffffffffba9a7766 RDI: 0000000000000000
Feb 06 20:37:02 pve kernel: RBP: ffffba0b40e0fb90 R08: 0000000000000286 R09: ffffba0b40e0fbc0
Feb 06 20:37:02 pve kernel: R10: 0000000000000000 R11: ffff98754f3af948 R12: ffff9875473f4520
Feb 06 20:37:02 pve kernel: R13: 0000000000000000 R14: 0000000000000000 R15: ffff987544717198
Feb 06 20:37:02 pve kernel: FS:  00007f7e8d133740(0000) GS:ffff987c3fa80000(0000) knlGS:0000000000000000
Feb 06 20:37:02 pve kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Feb 06 20:37:02 pve kernel: CR2: 0000000000000030 CR3: 0000000104f02002 CR4: 00000000001706e0
 
The old kernel is still installed. You can manually boot into an older version and then wait for the next PVE update to fix the current problems.
 

Thanks. I'm using the GRUB_DEFAULT="1>2" option, as suggested by @Struppie in the above thread, for the time being.
I wondered if my suggestion would work or just break things, given that the pve-kernel-5.13 metapackage depends on the latest kernel. I take it it's not advisable to uninstall the latest kernel.
My thinking was that if I forget about the GRUB edit (very likely!), then when the new kernel arrives and I reboot, I'll be back on the problematic kernel. In my use case, the issue does not manifest immediately.
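
For anyone wanting the same workaround, the change is roughly this (a sketch of the usual Debian GRUB setup; the "1>2" index just happens to match my boot menu, where the wanted kernel is the third entry in the "Advanced options" submenu, so check yours first):
Code:
# In /etc/default/grub: boot a specific menu entry instead of the default (0).
# "1>2" means submenu index 1 ("Advanced options for Proxmox VE"), entry index 2 within it.
GRUB_DEFAULT="1>2"

# Regenerate the GRUB config so the change takes effect on the next reboot:
update-grub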
 
Pleased to say this one seems resolved with updated package pve-kernel-5.13.19-4-pve (version 5.13.19-9).
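
For anyone checking whether they already have the fixed build, something like this should confirm the running kernel and the installed package version (the output shown is just what I'd expect here):
Code:
uname -r                               # running kernel, e.g. 5.13.19-4-pve
dpkg -l 'pve-kernel-5.13.19-4-pve'     # installed package, should list version 5.13.19-9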
 
