Tainted Kernel P IO 5.4.44-2-pve

HiltonT

New Member
Fresh install from the 6.2-4 ISO on a Dell R510, updated through the web interface to 6.2-10 using the no-subscription repository. ZFS mirror, one Win10 (2004) guest, and the following error - and when this error occurs, a "?" appears over everything under Datacenter for this (single) node: the VM and the storage...

Jul 15 18:08:00 qrk-pve-1 systemd[1]: Starting Proxmox VE replication runner...
Jul 15 18:08:01 qrk-pve-1 systemd[1]: pvesr.service: Succeeded.
Jul 15 18:08:01 qrk-pve-1 systemd[1]: Started Proxmox VE replication runner.
Jul 15 18:08:32 qrk-pve-1 kernel: INFO: task lvs:16795 blocked for more than 966 seconds.
Jul 15 18:08:32 qrk-pve-1 kernel: Tainted: P IO 5.4.44-2-pve #1
Jul 15 18:08:32 qrk-pve-1 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jul 15 18:08:32 qrk-pve-1 kernel: lvs D 0 16795 1665 0x00000000
Jul 15 18:08:32 qrk-pve-1 kernel: Call Trace:
Jul 15 18:08:32 qrk-pve-1 kernel: __schedule+0x2e6/0x6f0
Jul 15 18:08:32 qrk-pve-1 kernel: schedule+0x33/0xa0
Jul 15 18:08:32 qrk-pve-1 kernel: schedule_timeout+0x205/0x300
Jul 15 18:08:32 qrk-pve-1 kernel: ? ttwu_do_activate+0x5a/0x70
Jul 15 18:08:32 qrk-pve-1 kernel: wait_for_completion+0xb7/0x140
Jul 15 18:08:32 qrk-pve-1 kernel: ? wake_up_q+0x80/0x80
Jul 15 18:08:32 qrk-pve-1 kernel: __flush_work+0x131/0x1e0
Jul 15 18:08:32 qrk-pve-1 kernel: ? worker_detach_from_pool+0xb0/0xb0
Jul 15 18:08:32 qrk-pve-1 kernel: ? work_busy+0x90/0x90
Jul 15 18:08:32 qrk-pve-1 kernel: __cancel_work_timer+0x115/0x190
Jul 15 18:08:32 qrk-pve-1 kernel: ? exact_lock+0x11/0x20
Jul 15 18:08:32 qrk-pve-1 kernel: ? kobj_lookup+0xec/0x160
Jul 15 18:08:32 qrk-pve-1 kernel: cancel_delayed_work_sync+0x13/0x20
Jul 15 18:08:32 qrk-pve-1 kernel: disk_block_events+0x78/0x80
Jul 15 18:08:32 qrk-pve-1 kernel: __blkdev_get+0x72/0x560
Jul 15 18:08:32 qrk-pve-1 kernel: blkdev_get+0xe0/0x140
Jul 15 18:08:32 qrk-pve-1 kernel: ? blkdev_get_by_dev+0x50/0x50
Jul 15 18:08:32 qrk-pve-1 kernel: blkdev_open+0x87/0xa0
Jul 15 18:08:32 qrk-pve-1 kernel: do_dentry_open+0x143/0x3a0
Jul 15 18:08:32 qrk-pve-1 kernel: vfs_open+0x2d/0x30
Jul 15 18:08:32 qrk-pve-1 kernel: path_openat+0x2e9/0x16f0
Jul 15 18:08:32 qrk-pve-1 kernel: ? aio_read+0xfe/0x150
Jul 15 18:08:32 qrk-pve-1 kernel: ? __do_page_fault+0x250/0x4c0
Jul 15 18:08:32 qrk-pve-1 kernel: do_filp_open+0x93/0x100
Jul 15 18:08:32 qrk-pve-1 kernel: ? __alloc_fd+0x46/0x150
Jul 15 18:08:32 qrk-pve-1 kernel: do_sys_open+0x177/0x280
Jul 15 18:08:32 qrk-pve-1 kernel: ? __x64_sys_io_submit+0xa9/0x190
Jul 15 18:08:32 qrk-pve-1 kernel: __x64_sys_openat+0x20/0x30
Jul 15 18:08:32 qrk-pve-1 kernel: do_syscall_64+0x57/0x190
Jul 15 18:08:32 qrk-pve-1 kernel: entry_SYSCALL_64_after_hwframe+0x44/0xa9
Jul 15 18:08:32 qrk-pve-1 kernel: RIP: 0033:0x7f3aa52d91ae
Jul 15 18:08:32 qrk-pve-1 kernel: Code: Bad RIP value.
Jul 15 18:08:32 qrk-pve-1 kernel: RSP: 002b:00007ffd43da9b90 EFLAGS: 00000246 ORIG_RAX: 0000000000000101
Jul 15 18:08:32 qrk-pve-1 kernel: RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f3aa52d91ae
Jul 15 18:08:32 qrk-pve-1 kernel: RDX: 0000000000044000 RSI: 0000555b2ccf7e08 RDI: 00000000ffffff9c
Jul 15 18:08:32 qrk-pve-1 kernel: RBP: 00007ffd43da9cf0 R08: 0000555b2cf21000 R09: 0000000000000000
Jul 15 18:08:32 qrk-pve-1 kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffd43daae6f
Jul 15 18:08:32 qrk-pve-1 kernel: R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
Jul 15 18:08:32 qrk-pve-1 kernel: INFO: task lvs:32164 blocked for more than 120 seconds.
Jul 15 18:08:32 qrk-pve-1 kernel: Tainted: P IO 5.4.44-2-pve #1
Jul 15 18:08:32 qrk-pve-1 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jul 15 18:08:32 qrk-pve-1 kernel: lvs D 0 32164 1690 0x00000000
Jul 15 18:08:32 qrk-pve-1 kernel: Call Trace:
Jul 15 18:08:32 qrk-pve-1 kernel: __schedule+0x2e6/0x6f0
Jul 15 18:08:32 qrk-pve-1 kernel: ? __switch_to_asm+0x34/0x70
Jul 15 18:08:32 qrk-pve-1 kernel: schedule+0x33/0xa0
Jul 15 18:08:32 qrk-pve-1 kernel: schedule_preempt_disabled+0xe/0x10
Jul 15 18:08:32 qrk-pve-1 kernel: __mutex_lock.isra.10+0x2c9/0x4c0
Jul 15 18:08:32 qrk-pve-1 kernel: __mutex_lock_slowpath+0x13/0x20
Jul 15 18:08:32 qrk-pve-1 kernel: mutex_lock+0x2c/0x30
Jul 15 18:08:32 qrk-pve-1 kernel: disk_block_events+0x31/0x80
Jul 15 18:08:32 qrk-pve-1 kernel: __blkdev_get+0x72/0x560
Jul 15 18:08:32 qrk-pve-1 kernel: blkdev_get+0xe0/0x140
Jul 15 18:08:32 qrk-pve-1 kernel: ? blkdev_get_by_dev+0x50/0x50
Jul 15 18:08:32 qrk-pve-1 kernel: blkdev_open+0x87/0xa0
Jul 15 18:08:32 qrk-pve-1 kernel: do_dentry_open+0x143/0x3a0
Jul 15 18:08:32 qrk-pve-1 kernel: vfs_open+0x2d/0x30
Jul 15 18:08:32 qrk-pve-1 kernel: path_openat+0x2e9/0x16f0
Jul 15 18:08:32 qrk-pve-1 kernel: ? filename_lookup.part.60+0xe0/0x170
Jul 15 18:08:32 qrk-pve-1 kernel: do_filp_open+0x93/0x100
Jul 15 18:08:32 qrk-pve-1 kernel: ? __alloc_fd+0x46/0x150
Jul 15 18:08:32 qrk-pve-1 kernel: do_sys_open+0x177/0x280
Jul 15 18:08:32 qrk-pve-1 kernel: __x64_sys_openat+0x20/0x30
Jul 15 18:08:32 qrk-pve-1 kernel: do_syscall_64+0x57/0x190
Jul 15 18:08:32 qrk-pve-1 kernel: entry_SYSCALL_64_after_hwframe+0x44/0xa9
Jul 15 18:08:32 qrk-pve-1 kernel: RIP: 0033:0x7fd73d3201ae
Jul 15 18:08:32 qrk-pve-1 kernel: Code: Bad RIP value.
Jul 15 18:08:32 qrk-pve-1 kernel: RSP: 002b:00007ffeb1949350 EFLAGS: 00000246 ORIG_RAX: 0000000000000101
Jul 15 18:08:32 qrk-pve-1 kernel: RAX: ffffffffffffffda RBX: 00007ffeb194aed4 RCX: 00007fd73d3201ae
Jul 15 18:08:32 qrk-pve-1 kernel: RDX: 0000000000044000 RSI: 0000564c06b09b68 RDI: 00000000ffffff9c
Jul 15 18:08:32 qrk-pve-1 kernel: RBP: 00007ffeb19494b0 R08: 0000564c06b76b90 R09: 0000000000000000
Jul 15 18:08:32 qrk-pve-1 kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffeb194af22
Jul 15 18:08:32 qrk-pve-1 kernel: R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000

And this error repeats...
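
The grey "?" over the node, VM and storage usually means pvestatd (the status daemon) has itself become stuck behind whatever is blocking storage. A quick way to check while the error is active, using only standard tools:

systemctl status pvestatd               # is the Proxmox status daemon still alive and responsive?
ps -eo pid,stat,comm | awk '$2 ~ /^D/'  # list tasks in uninterruptible (D) sleep, i.e. stuck waiting on I/O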
 
The error you are seeing is not the "Tainted" part, but the "INFO: task lvs:16795 blocked for more than 966 seconds."

This means a task is waiting for some resource from the kernel to become available, and that is not happening. Since the blocked task is lvs (the LVM volume scan tool) in your case, I'm going to take a guess and say some hard drive is faulty or misconfigured.
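
If a drive is on its way out, the kernel log and SMART data will often show it before anything else does. A minimal set of checks, assuming smartmontools is installed and /dev/sda stands in for the actual device:

dmesg -T | grep -iE 'i/o error|blk_update_request'  # recent block-layer errors from the kernel
smartctl -a /dev/sda                                # SMART health, error counters, reallocated sectors
zpool status -v                                     # ZFS's own view of the mirror and any read/write/cksum errors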
 
I had assumed this was a failing-HDD issue - I'm heading to the location where this server is this afternoon and will be binning the existing HDDs and replacing them all. They are old and small, and I wouldn't be surprised if they were failing. The LVM is on the PERC 630 controller in a RAID-1 array, and as we know, hardware RAID isn't nearly as good at reporting failing drives as ZFS is (the ZFS drives, obviously, aren't behind any RAID configuration).
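
For what it's worth, smartctl can often query drives sitting behind a MegaRAID-family controller such as the PERC, so SMART can be checked even when the RAID firmware stays quiet. A sketch, assuming the controller exposes its disks as megaraid targets and /dev/sda is the exported RAID-1 volume:

# step the target number through the physical slots (0, 1, ...) until no more drives are found
smartctl -d megaraid,0 -a /dev/sda
smartctl -d megaraid,1 -a /dev/sda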

I'll do a fresh Proxmox install on this box this evening and see how it goes from there.
 
