Hi,
I recently upgraded our cluster to 7.2.4 (coming from the latest 7.1), and since then /var/log/messages has been filling up with kernel crash reports:
Code:
May 19 17:00:05 pve01 kernel: [16498.218023] <TASK>
May 19 17:00:05 pve01 kernel: [16498.266312] kthread+0x127/0x150
May 19 17:00:05 pve01 kernel: [16498.271642] ? set_kthread_struct+0x50/0x50
May 19 17:00:05 pve01 kernel: [16498.278014] ? throttle_active_work+0xe2/0x1f0
May 19 17:00:05 pve01 kernel: [16498.304283] kthread+0x127/0x150
May 19 17:00:06 pve01 kernel: [16499.239798] process_one_work+0x228/0x3d0
May 19 17:00:06 pve01 kernel: [16499.337357] <TASK>
May 19 17:00:06 pve01 kernel: [16499.341011] ? throttle_active_work+0xe2/0x1f0
May 19 17:00:06 pve01 kernel: [16499.350852] ? process_one_work+0x3d0/0x3d0
May 19 17:00:07 pve01 kernel: [16500.325987] Call Trace:
May 19 17:00:07 pve01 kernel: [16500.329407] kthread+0x127/0x150
May 19 17:00:07 pve01 kernel: [16500.330798] ? process_one_work+0x3d0/0x3d0
May 19 17:00:08 pve01 kernel: [16501.350924] <TASK>
May 19 17:00:08 pve01 kernel: [16501.354426] ? set_kthread_struct+0x50/0x50
May 19 17:00:08 pve01 kernel: [16501.354885] <TASK>
May 19 17:00:09 pve01 kernel: [16502.376740] worker_thread+0x53/0x410
May 19 17:00:10 pve01 kernel: [16503.337078] ? set_kthread_struct+0x50/0x50
May 19 17:00:10 pve01 kernel: [16503.462301] <TASK>
May 19 17:00:11 pve01 kernel: [16504.359709] worker_thread+0x53/0x410
May 19 17:00:12 pve01 kernel: [16505.516017] ? process_one_work+0x3d0/0x3d0
May 19 17:00:13 pve01 kernel: [16506.471940] <TASK>
May 19 17:00:13 pve01 kernel: [16506.477785] ? process_one_work+0x3d0/0x3d0
May 19 17:00:13 pve01 kernel: [16506.480187] ? set_kthread_struct+0x50/0x50
May 19 17:00:14 pve01 kernel: [16507.436723] ? set_kthread_struct+0x50/0x50
May 19 17:00:14 pve01 kernel: [16507.437515] </TASK>
May 19 17:00:15 pve01 kernel: [16508.588336] ? set_kthread_struct+0x50/0x50
May 19 17:00:16 pve01 kernel: [16509.484090] kthread+0x127/0x150
May 19 17:00:16 pve01 kernel: [16509.543859] worker_thread+0x53/0x410
May 19 17:00:16 pve01 kernel: [16509.613053] </TASK>
May 19 17:00:17 pve01 kernel: [16510.629877] Call Trace:
May 19 17:00:18 pve01 kernel: [16511.591705] worker_thread+0x53/0x410
May 19 17:00:18 pve01 kernel: [16511.660203] worker_thread+0x53/0x410
May 19 17:00:19 pve01 kernel: [16512.615358] ? throttle_active_work+0xe2/0x1f0
May 19 17:00:19 pve01 kernel: [16512.678853] ? throttle_active_work+0xe2/0x1f0
May 19 17:00:20 pve01 kernel: [16513.637848] Call Trace:
May 19 17:00:20 pve01 kernel: [16513.640556] worker_thread+0x53/0x410
May 19 17:00:21 pve01 kernel: [16514.600594] worker_thread+0x53/0x410
May 19 17:00:21 pve01 kernel: [16514.668270] worker_thread+0x53/0x410
May 19 17:00:22 pve01 kernel: [16515.688179] worker_thread+0x53/0x410
We use a Ceph pool, ZFS replication, and cgroup-v1.
The bug was present only on the compute hosts.
Using kernel 5.13.19-6-pve instead of 5.15.35-1-pve solved the issue.
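For anyone hitting the same traces and wanting to fall back to the older kernel, this is roughly what I mean (a sketch, assuming your nodes boot via proxmox-boot-tool; the `kernel pin` subcommand is available from PVE 7.2 onward, on older setups you would instead select the kernel once in the GRUB menu):

```shell
# Make sure the known-good kernel package is still installed
apt install pve-kernel-5.13.19-6-pve

# List the kernels proxmox-boot-tool knows about
proxmox-boot-tool kernel list

# Pin the 5.13 kernel so it stays the default across reboots,
# then reboot the node into it
proxmox-boot-tool kernel pin 5.13.19-6-pve
reboot
```

Once a fixed 5.15 kernel is released, `proxmox-boot-tool kernel unpin` reverts to booting the newest installed kernel again.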