kernel 6.5.11-8-pve issue with task:pvescheduler

tsimblist

I applied updates to some of the nodes in my Proxmox cluster this morning, including the new kernel 6.5.11-8-pve. I rebooted the third server and it came back up with some issues: it didn't seem to have rejoined the Proxmox cluster, and there was a little red X next to the server in the GUI. However, it is also a Ceph node, and Ceph reported that all of its OSDs came back online and the Ceph cluster was healthy.

The following eventually appeared on my serial console:
Code:
xeon1230v2 login: [  244.159592] INFO: task pvescheduler:8378 blocked for more .
[  244.166598]       Tainted: P           O       6.5.11-8-pve #1               
[  244.172468] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this.
[  244.180330] task:pvescheduler    state:D stack:0     pid:8378  ppid:8377   f6
[  244.188708] Call Trace:                                                     
[  244.191168]  <TASK>                                                         
[  244.193358]  __schedule+0x3fd/0x1450                                         
[  244.196961]  ? try_to_unlazy+0x60/0xe0                                       
[  244.200775]  ? terminate_walk+0x65/0x100                                     
[  244.204791]  ? path_parentat+0x49/0x90                                       
[  244.208563]  schedule+0x63/0x110                                             
[  244.211862]  schedule_preempt_disabled+0x15/0x30                             
[  244.216558]  rwsem_down_write_slowpath+0x392/0x6a0                           
[  244.221375]  down_write+0x5c/0x80                                           
[  244.224780]  filename_create+0xaf/0x1b0                                     
[  244.228697]  do_mkdirat+0x5d/0x170                                           
[  244.232223]  __x64_sys_mkdir+0x4a/0x70                                       
[  244.236074]  do_syscall_64+0x5b/0x90                                         
[  244.239793]  ? __x64_sys_alarm+0x76/0xd0                                     
[  244.243807]  ? exit_to_user_mode_prepare+0x39/0x190                         
[  244.248897]  ? syscall_exit_to_user_mode+0x37/0x60                           
[  244.253833]  ? do_syscall_64+0x67/0x90                                       
[  244.257618]  ? exit_to_user_mode_prepare+0x39/0x190                         
[  244.262575]  ? irqentry_exit_to_user_mode+0x17/0x20                         
[  244.267489]  ? irqentry_exit+0x43/0x50                                       
[  244.271316]  ? exc_page_fault+0x94/0x1b0                                     
[  244.275302]  entry_SYSCALL_64_after_hwframe+0x6e/0xd8                       
[  244.280554] RIP: 0033:0x7f3ba6179e27                                         
[  244.284181] RSP: 002b:00007ffc6b732c38 EFLAGS: 00000246 ORIG_RAX: 00000000003
[  244.291803] RAX: ffffffffffffffda RBX: 0000561f6f19f2a0 RCX: 00007f3ba6179e27
[  244.298949] RDX: 0000000000000026 RSI: 00000000000001ff RDI: 0000561f6f1d3ee0
[  244.306144] RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000
[  244.313299] R10: 0000000000000000 R11: 0000000000000246 R12: 0000561f6f1a4c88
[  244.320478] R13: 0000561f6f1d3ee0 R14: 0000561f6f444768 R15: 00000000000001ff
[  244.327783]  </TASK>

Other servers in the cluster (7 nodes total) started complaining about various issues, and the red X began to propagate to them whether or not they were running the new kernel. Yet Ceph remained healthy.

I finally restored order by rebooting each server: first the three that had the new kernel, rolling them back to 6.5.11-7-pve, then the other four servers that I hadn't yet updated. After that, everything was fine again.

Too much excitement for one morning!
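
In case it helps anyone else, one way to boot an older kernel on Proxmox VE is to pin it with proxmox-boot-tool; a rough sketch, assuming the previous kernel is still installed:

Code:
# list the kernels known to the boot loader
proxmox-boot-tool kernel list
# pin the previous kernel so it is chosen on the next boot
proxmox-boot-tool kernel pin 6.5.11-7-pve
reboot
# once a fixed kernel is available, remove the pin again
proxmox-boot-tool kernel unpin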
 
Package versions

Code:
proxmox-ve: 8.1.0 (running kernel: 6.5.11-7-pve)
pve-manager: 8.1.4 (running version: 8.1.4/ec5affc9e41f1d79)
proxmox-kernel-helper: 8.1.0
pve-kernel-6.2: 8.0.5
proxmox-kernel-6.5: 6.5.11-8
proxmox-kernel-6.5.11-8-pve-signed: 6.5.11-8
proxmox-kernel-6.5.11-7-pve-signed: 6.5.11-7
proxmox-kernel-6.5.11-6-pve-signed: 6.5.11-6
pve-kernel-5.4: 6.4-4
proxmox-kernel-6.2.16-20-pve: 6.2.16-20
proxmox-kernel-6.2: 6.2.16-20
pve-kernel-5.13.19-2-pve: 5.13.19-4
pve-kernel-5.4.124-1-pve: 5.4.124-1
pve-kernel-5.4.73-1-pve: 5.4.73-1
ceph: 17.2.7-pve2
ceph-fuse: 17.2.7-pve2
corosync: 3.1.7-pve3
criu: 3.17.1-2
glusterfs-client: 10.3-5
ifupdown: not correctly installed
ifupdown2: 3.2.0-1+pmx8
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-4
libknet1: 1.28-pve1
libproxmox-acme-perl: 1.5.0
libproxmox-backup-qemu0: 1.4.1
libproxmox-rs-perl: 0.3.3
libpve-access-control: 8.0.7
libpve-apiclient-perl: 3.3.1
libpve-common-perl: 8.1.0
libpve-guest-common-perl: 5.0.6
libpve-http-server-perl: 5.0.5
libpve-network-perl: 0.9.5
libpve-rs-perl: 0.8.8
libpve-storage-perl: 8.0.5
libqb0: 1.0.5-1
libspice-server1: 0.15.1-1
lvm2: 2.03.16-2
lxc-pve: 5.0.2-4
lxcfs: 5.0.3-pve4
novnc-pve: 1.4.0-3
openvswitch-switch: 3.1.0-2
proxmox-backup-client: 3.1.3-1
proxmox-backup-file-restore: 3.1.3-1
proxmox-kernel-helper: 8.1.0
proxmox-mail-forward: 0.2.3
proxmox-mini-journalreader: 1.4.0
proxmox-offline-mirror-helper: 0.6.4
proxmox-widget-toolkit: 4.1.3
pve-cluster: 8.0.5
pve-container: 5.0.8
pve-docs: 8.1.3
pve-edk2-firmware: 4.2023.08-3
pve-firewall: 5.0.3
pve-firmware: 3.9-1
pve-ha-manager: 4.0.3
pve-i18n: 3.2.0
pve-qemu-kvm: 8.1.2-6
pve-xtermjs: 5.3.0-3
qemu-server: 8.0.10
smartmontools: 7.3-pve1
spiceterm: 3.3.0
swtpm: 0.8.0+pve1
vncterm: 1.8.0
zfsutils-linux: 2.2.2-pve1
 
Hi,
please check the system logs/journal from around the time the issue happened.
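For example, to pull a window around the incident into a file (the time window and unit names below are only illustrative, adjust them to your case):

Code:
# whole journal for the relevant time window
journalctl --since "05:30" --until "05:45" > journal-excerpt.txt
# or only the most relevant services
journalctl -u pvescheduler -u pve-cluster -u corosync --since "2 hours ago"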

Code:
[  244.228697]  do_mkdirat+0x5d/0x170
So it was stuck in a mkdir call. Are you doing backups to CephFS?
 
So it was stuck in a mkdir call. Are you doing backups to CephFS?
No, I use Proxmox Backup Server. However, the PBS was offline when this issue presented itself.

I have attached an excerpt from the syslog, covering from about two minutes before the issue was reported on the serial console until I forced a reboot with a machine reset. I had issued a reboot command to the server, but it was not going down in a timely fashion, so I forced the issue.
 

Code:
Feb 03 05:34:52 xeon1230v2 pmxcfs[2976]: [dcdb] notice: members: 1/2976, 2/1370, 4/1483, 5/2229, 6/3019, 7/3137
Feb 03 05:34:52 xeon1230v2 corosync[3148]:   [QUORUM] Members[7]: 1 2 3 4 5 6 7
Feb 03 05:34:52 xeon1230v2 corosync[3148]:   [MAIN  ] Completed service synchronization, ready to provide service.
Corosync quorum was reached here.
Code:
Feb 03 05:34:52 xeon1230v2 pvestatd[3675]: pbackups: error fetching datastores - 500 Can't connect to 192.168.1.36:8007 (No route to host)
Feb 03 05:34:52 xeon1230v2 pmxcfs[2976]: [dcdb] notice: cpg_send_message retried 1 times
Feb 03 05:34:52 xeon1230v2 pmxcfs[2976]: [dcdb] notice: members: 1/2976, 2/1370, 3/1429, 4/1483, 5/2229, 6/3019, 7/3137
Feb 03 05:34:52 xeon1230v2 pmxcfs[2976]: [dcdb] notice: queue not emtpy - resening 29 messages
Feb 03 05:34:52 xeon1230v2 pmxcfs[2976]: [status] notice: members: 1/2976, 2/1370, 4/1483, 5/2229, 6/3019, 7/3137
Feb 03 05:34:52 xeon1230v2 pmxcfs[2976]: [status] notice: members: 1/2976, 2/1370, 3/1429, 4/1483, 5/2229, 6/3019, 7/3137
Feb 03 05:34:52 xeon1230v2 pmxcfs[2976]: [status] notice: queue not emtpy - resening 3424 messages
Feb 03 05:34:52 xeon1230v2 pmxcfs[2976]: [dcdb] notice: received sync request (epoch 1/2976/00000004)
Feb 03 05:34:52 xeon1230v2 pmxcfs[2976]: [dcdb] notice: received sync request (epoch 1/2976/00000005)
Feb 03 05:34:52 xeon1230v2 pmxcfs[2976]: [status] notice: received sync request (epoch 1/2976/00000004)
Feb 03 05:34:52 xeon1230v2 pmxcfs[2976]: [status] notice: received sync request (epoch 1/2976/00000005)
Feb 03 05:34:54 xeon1230v2 corosync[3148]:   [TOTEM ] Retransmit List: 9e 9f 9d
Feb 03 05:35:02 xeon1230v2 pvestatd[3675]: pbackups: error fetching datastores - 500 Can't connect to 192.168.1.36:8007 (No route to host)
Feb 03 05:35:12 xeon1230v2 pvestatd[3675]: pbackups: error fetching datastores - 500 Can't connect to 192.168.1.36:8007 (No route to host)
Code:
Feb 03 05:35:15 xeon1230v2 corosync[3148]:   [KNET  ] link: host: 3 link: 0 is down
Feb 03 05:35:15 xeon1230v2 corosync[3148]:   [KNET  ] host: host: 3 (passive) best link: 0 (pri: 1)
Feb 03 05:35:15 xeon1230v2 corosync[3148]:   [KNET  ] host: host: 3 has no active links
Feb 03 05:35:19 xeon1230v2 corosync[3148]:   [KNET  ] rx: host: 3 link: 0 is up
Feb 03 05:35:19 xeon1230v2 corosync[3148]:   [KNET  ] link: Resetting MTU for link 0 because host 3 joined
Feb 03 05:35:19 xeon1230v2 corosync[3148]:   [KNET  ] host: host: 3 (passive) best link: 0 (pri: 1)
Feb 03 05:35:19 xeon1230v2 corosync[3148]:   [KNET  ] pmtud: Global data MTU changed to: 1397
But apparently the link to host 3 was a bit flaky.
Code:
Feb 03 05:35:22 xeon1230v2 pvestatd[3675]: pbackups: error fetching datastores - 500 Can't connect to 192.168.1.36:8007 (No route to host)
Feb 03 05:35:32 xeon1230v2 pvestatd[3675]: pbackups: error fetching datastores - 500 Can't connect to 192.168.1.36:8007 (No route to host)
Feb 03 05:35:43 xeon1230v2 pvestatd[3675]: pbackups: error fetching datastores - 500 Can't connect to 192.168.1.36:8007 (No route to host)
Feb 03 05:35:52 xeon1230v2 pvestatd[3675]: pbackups: error fetching datastores - 500 Can't connect to 192.168.1.36:8007 (No route to host)
Feb 03 05:36:01 xeon1230v2 systemd-logind[2288]: The system will reboot now!
The reboot was issued just one minute after that. Just a guess, but maybe Corosync and the cluster filesystem (pmxcfs) hadn't had time to come up fully yet, and pvescheduler was waiting on the cluster filesystem.
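
To check whether the cluster filesystem was up and responsive at that point, something along these lines can help (a generic sketch):

Code:
systemctl status pve-cluster corosync
pvecm status
# /etc/pve is the pmxcfs mount point; if pmxcfs is blocked, this may hang
ls /etc/pve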

Is the cluster network for corosync a dedicated network or shared with Ceph? If the latter, that can cause issues, see https://pve.proxmox.com/pve-docs/chapter-pvecm.html#_cluster_network
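
If you do give corosync its own network (or add it as an extra link), the relevant part of /etc/pve/corosync.conf looks roughly like this; the addresses below are made up for illustration, and config_version in the totem section must be increased when editing:

Code:
nodelist {
  node {
    name: xeon1230v2
    nodeid: 1
    quorum_votes: 1
    # dedicated corosync network (example subnet)
    ring0_addr: 10.10.10.1
    # optional second link on the existing network
    ring1_addr: 192.168.1.11
  }
  # ...one entry per node...
}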
 
Is the cluster network for corosync a dedicated network or shared with Ceph?
There is a separate VLAN for Ceph traffic, but corosync shares the default network with everything else. This is a homelab with low traffic volume, and it has not been a problem until now.

But apparently the link to host 3 was a bit flaky.
Nodes 1, 2 & 3 (see below) were all running 6.5.11-8-pve at that point. Node 3 was the first to be updated and rebooted, then node 2 and finally node 1. The Ceph servers are nodes 1, 6 and 7.

Code:
root@xeon1230v2:~# pvecm nodes

Membership information
----------------------
    Nodeid      Votes Name
         1          1 xeon1230v2 (local)
         2          1 optiplex3050
         3          1 z440
         4          1 epyc3000
         5          1 epyc3251
         6          1 xeon1220
         7          1 e3-1230v2
root@xeon1230v2:~#

Attached is a copy of the syslog for node 3 from first boot after 6.5.11-8-pve until the forced reboot after it became unstable. It looks like it had the same issue with task pvescheduler. Possibly provoked by node 1?
 


I created a new VLAN so I could implement a separate cluster network. I used my nested PVE test cluster (virtual machines) to learn how to change the cluster network configuration. Once I had that figured out, I made the configuration changes to my bare-metal cluster and verified it was working properly.
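
For reference, the per-node network change is roughly this in /etc/network/interfaces (ifupdown2 syntax; the VLAN ID and address are just example values):

Code:
# VLAN 50 on the existing bridge, used only for corosync (example values)
auto vmbr0.50
iface vmbr0.50 inet static
        address 10.10.10.1/24
# apply with: ifreload -a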

This morning I tried booting 6.5.11-8-pve again and found no trouble. So I finished updating and rebooting the remaining nodes.

All is well.

Thank you for your support.
 
I created a new VLAN so I could implement a separate cluster network.
Is that VLAN somehow prioritized by your switches? Otherwise, latency will still be affected by other traffic on the same physical interface(s).
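
One way to gauge that is to look at corosync's own view of the links and at round-trip times while the network is busy; a rough sketch (stat names vary between corosync versions):

Code:
# knet link state as corosync sees it
corosync-cfgtool -s
# knet runtime statistics; grep for latency if your version exposes it
corosync-cmapctl -m stats | grep -i latency
# plain round-trip time to another node while generating load elsewhere
ping -c 20 <address-of-another-node>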

This morning I tried booting 6.5.11-8-pve again and found no trouble. So I finished updating and rebooting the remaining nodes.
Glad to hear :)
 
Is that VLAN somehow prioritized by your switches? Otherwise, latency will still be affected by other traffic on the same physical interface(s).
I suspected that might be the case. I did a little checking and there does not seem to be any QoS option for VLANs on my UniFi switches.

So it was a good learning exercise, but I haven't actually fixed anything yet. I am now contemplating an 8-port switch for a truly private network.

Thanks again.