Are these kernel panics? Now what?

proksmoks

Hi, for a while now I've been having trouble with my Proxmox setup. It's probably not Proxmox-related and my hardware is a bit exotic, but maybe someone can point me in the right direction.

After a few days, sometimes weeks, I get what I think are kernel panics (??), dmesg gives:

Code:
[844132.205609] Call Trace:
[844132.205614]  __schedule+0x2e6/0x6f0
[844132.205616]  schedule+0x33/0xa0
[844132.205621]  spl_panic+0xf9/0xfb [spl]
[844132.205625]  ? spl_kmem_cache_alloc+0x7c/0x770 [spl]
[844132.205628]  ? spl_kmem_cache_alloc+0x14d/0x770 [spl]
[844132.205630]  ? __wake_up_common_lock+0x8c/0xc0
[844132.205677]  ? zio_taskq_member.isra.14.constprop.20+0x70/0x70 [zfs]
[844132.205701]  arc_buf_type.isra.23+0x4e/0x50 [zfs]
[844132.205745]  arc_change_state.isra.29+0x27/0x480 [zfs]
[844132.205765]  arc_freed+0xa7/0xc0 [zfs]
[844132.205816]  zio_free_sync+0x52/0x100 [zfs]
[844132.205849]  spa_free_sync_cb+0x3b/0x50 [zfs]
[844132.205881]  ? spa_avz_build+0xf0/0xf0 [zfs]
[844132.205901]  bplist_iterate+0xd1/0x140 [zfs]
[844132.205949]  spa_sync+0x5c9/0xff0 [zfs]
[844132.205951]  ? mutex_lock+0x12/0x30
[844132.205998]  ? spa_txg_history_init_io+0x104/0x110 [zfs]
[844132.206032]  txg_sync_thread+0x2e1/0x4a0 [zfs]
[844132.206066]  ? txg_thread_exit.isra.13+0x60/0x60 [zfs]
[844132.206070]  thread_generic_wrapper+0x74/0x90 [spl]
[844132.206072]  kthread+0x120/0x140
[844132.206075]  ? __thread_exit+0x20/0x20 [spl]
[844132.206076]  ? kthread_park+0x90/0x90
[844132.206078]  ret_from_fork+0x35/0x40

There are usually more messages like this, but I don't know what I'm looking at. I'm left with a weirdly unresponsive PVE node. Most of the 10+ containers keep functioning normally, but most won't respond to a shutdown -h command. Killing processes with kill -9 $pid doesn't work either. The node won't reboot or shut down, so I have to resort to a power cycle.
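For what it's worth, processes that ignore kill -9 are usually stuck in uninterruptible sleep inside the kernel. A quick, generic way to check (not Proxmox-specific):

```shell
# Processes in "D" state (uninterruptible sleep) are blocked inside
# the kernel, usually on I/O, and ignore every signal -- including
# SIGKILL. If txg_sync and container processes show up here, that
# would explain why kill -9 and shutdown hang.
ps -eo pid,stat,comm | awk '$2 ~ /^D/ {print $1, $3}'
```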

I've let memtest86 run its course overnight: zero errors. The hardware is a Fujitsu 3643 board and an i5-8400T CPU (an engineering sample!) that should give me a very low-power setup.

Any hints very much appreciated.
 
Hi,

The actual error message should be a couple of lines above this (maybe starting with "INFO"). This only shows the call trace which led to the issue. Could you post the full message and the output of pveversion -v?
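As a sketch of what to look for: `grep -B` prints the context lines preceding a match, which is where the real error ("INFO: task ... blocked") sits relative to the "Call Trace:" marker. A demo on a captured snippet (the sample lines here are just an illustration):

```shell
# grep -B N prints N lines of context before each match. On the
# live system the equivalent would be: dmesg | grep -B 5 'Call Trace:'
printf '%s\n' \
  'INFO: task txg_sync:1033 blocked for more than 1087 seconds.' \
  'txg_sync        D    0  1033      2 0x80004000' \
  'Call Trace:' \
  ' __schedule+0x2e6/0x6f0' |
grep -B 2 'Call Trace:'
```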
 
This particular message wasn't written to the logs (because of the problems?). I grepped through /var/log/*, but the last message like this was:

Code:
Nov 29 09:37:00 pve systemd[1]: Starting Proxmox VE replication runner...
Nov 29 09:37:00 pve systemd[1]: pvesr.service: Succeeded.
Nov 29 09:37:00 pve systemd[1]: Started Proxmox VE replication runner.
Nov 29 09:37:11 pve kernel: [844011.373667] INFO: task txg_sync:1033 blocked for more than 1087 seconds.
Nov 29 09:37:11 pve kernel: [844011.373672]       Tainted: P           O      5.4.128-1-pve #1
Nov 29 09:37:11 pve kernel: [844011.373673] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 29 09:37:11 pve kernel: [844011.373675] txg_sync        D    0  1033      2 0x80004000
Nov 29 09:37:11 pve kernel: [844011.373677] Call Trace:
Nov 29 09:37:11 pve kernel: [844011.373681]  __schedule+0x2e6/0x6f0
Nov 29 09:37:11 pve kernel: [844011.373684]  schedule+0x33/0xa0
Nov 29 09:37:11 pve kernel: [844011.373689]  spl_panic+0xf9/0xfb [spl]
Nov 29 09:37:11 pve kernel: [844011.373692]  ? spl_kmem_cache_alloc+0x7c/0x770 [spl]
Nov 29 09:37:11 pve kernel: [844011.373695]  ? spl_kmem_cache_alloc+0x14d/0x770 [spl]
Nov 29 09:37:11 pve kernel: [844011.373697]  ? __wake_up_common_lock+0x8c/0xc0
Nov 29 09:37:11 pve kernel: [844011.373742]  ? zio_taskq_member.isra.14.constprop.20+0x70/0x70 [zfs]
Nov 29 09:37:11 pve kernel: [844011.373762]  arc_buf_type.isra.23+0x4e/0x50 [zfs]
Nov 29 09:37:11 pve kernel: [844011.373782]  arc_change_state.isra.29+0x27/0x480 [zfs]
Nov 29 09:37:11 pve kernel: [844011.373803]  arc_freed+0xa7/0xc0 [zfs]
Nov 29 09:37:11 pve kernel: [844011.373841]  zio_free_sync+0x52/0x100 [zfs]
Nov 29 09:37:11 pve kernel: [844011.373933]  ? spa_avz_build+0xf0/0xf0 [zfs]
Nov 29 09:37:11 pve kernel: [844011.373957]  bplist_iterate+0xd1/0x140 [zfs]
Nov 29 09:37:11 pve kernel: [844011.374020]  spa_sync+0x5c9/0xff0 [zfs]
Nov 29 09:37:11 pve kernel: [844011.374055]  ? spa_txg_history_init_io+0x104/0x110 [zfs]
Nov 29 09:37:11 pve kernel: [844011.374123]  ? txg_thread_exit.isra.13+0x60/0x60 [zfs]
Nov 29 09:37:11 pve kernel: [844011.374128]  kthread+0x120/0x140
Nov 29 09:37:11 pve kernel: [844011.374132]  ? kthread_park+0x90/0x90
Nov 29 09:38:00 pve systemd[1]: Starting Proxmox VE replication runner...
Nov 29 09:38:00 pve systemd[1]: pvesr.service: Succeeded.

There are several more "txg_sync:1033 blocked for more than X seconds" messages before this one.
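A side note on those messages: the "blocked for more than X seconds" threshold is a kernel sysctl, and the message is a hung-task warning (txg_sync is stuck) rather than a true panic. A quick way to inspect it, assuming the hung-task detector is built into the kernel:

```shell
# The hung-task detector warns when a task sits in uninterruptible
# sleep past this threshold (120 s by default on many kernels).
cat /proc/sys/kernel/hung_task_timeout_secs 2>/dev/null \
  || echo "hung-task detector not available on this kernel"
```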

pveversion:

Code:
root@pve:~# pveversion -v
proxmox-ve: 6.4-1 (running kernel: 5.4.128-1-pve)
pve-manager: 6.4-13 (running version: 6.4-13/9f411e79)
pve-kernel-5.4: 6.4-5
pve-kernel-helper: 6.4-5
pve-kernel-5.4.128-1-pve: 5.4.128-2
pve-kernel-5.4.114-1-pve: 5.4.114-1
pve-kernel-5.4.106-1-pve: 5.4.106-1
ceph-fuse: 12.2.11+dfsg1-2.1+b1
corosync: 3.1.2-pve1
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown: 0.8.35+pve1
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.20-pve1
libproxmox-acme-perl: 1.1.0
libproxmox-backup-qemu0: 1.1.0-1
libpve-access-control: 6.4-3
libpve-apiclient-perl: 3.1-3
libpve-common-perl: 6.4-3
libpve-guest-common-perl: 3.1-5
libpve-http-server-perl: 3.2-3
libpve-storage-perl: 6.4-1
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve4
lxc-pve: 4.0.6-2
lxcfs: 4.0.6-pve1
novnc-pve: 1.1.0-1
proxmox-backup-client: 1.1.13-2
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.6-1
pve-cluster: 6.4-1
pve-container: 3.3-6
pve-docs: 6.4-2
pve-edk2-firmware: 2.20200531-1
pve-firewall: 4.1-4
pve-firmware: 3.2-4
pve-ha-manager: 3.1-1
pve-i18n: 2.3-1
pve-qemu-kvm: 5.2.0-6
pve-xtermjs: 4.7.0-3
qemu-server: 6.4-2
smartmontools: 7.2-pve2
spiceterm: 3.1-1
vncterm: 1.6-2
zfsutils-linux: 2.0.5-pve1~bpo10+1

I notice txg_sync points to ZFS. I run the pool on two 2.5-inch drives (again, to keep power usage down). They're in a ZFS mirror, and one of them is connected over USB 3.

I also notice that apt update followed by apt list --upgradable shows quite a few upgradable packages.
 
I notice txg_sync points to ZFS. I run the pool on two 2.5-inch drives (again, to keep power usage down). They're in a ZFS mirror, and one of them is connected over USB 3.
This could lead to the timeouts you saw. Please post the output of zpool status -v (in CODE tags).
 
Code:
root@pve:~# zpool status -v
  pool: vijver
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-9P
  scan: resilvered 2.54T in 09:39:43 with 0 errors on Tue Nov 16 03:25:03 2021
config:

        NAME        STATE     READ WRITE CKSUM
        vijver      ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            sdb     ONLINE       0     0     0
            sdc1    ONLINE       0     0     1

errors: No known data errors

Please note that the resilvering happened when I replaced a failed 2.5" drive. I'm not sure what the CKSUM column means, but I'm worried.

zpool history shows nothing special, but zpool events shows this:
Code:
Dec  1 2021 01:01:37.483128881 ereport.fs.zfs.checksum

Meanwhile I'm clinging to the "No known data errors" statement and cherishing my backups.
 
That output shows what I expected. You can get rid of the cksum error as described: run zpool clear and then a manual scrub to get an up-to-date integrity check.

While scrubbing, monitor the disk I/O with dstat, iostat, or another monitoring tool, especially the I/O times. I suspect the USB device is slower, maybe much slower.
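If iostat isn't installed, a rough per-disk view can be read straight from /proc/diskstats (a minimal sketch; the device names sdb/sdc are taken from the zpool status output above):

```shell
# /proc/diskstats, per device: field 4 = reads completed,
# field 8 = writes completed, field 13 = milliseconds spent doing
# I/O. Sample it twice during the scrub; if the USB disk's io_ms
# grows much faster than the internal disk's, the USB path is the
# bottleneck.
awk '$3 ~ /^sd[a-z]$/ {printf "%s reads=%s writes=%s io_ms=%s\n", $3, $4, $8, $13}' /proc/diskstats
```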
 
That output shows what I expected. You can get rid of the cksum error as described: run zpool clear and then a manual scrub to get an up-to-date integrity check.

While scrubbing, monitor the disk I/O with dstat, iostat, or another monitoring tool, especially the I/O times. I suspect the USB device is slower, maybe much slower.
Alright, I'll let you know what happens.
 
Well, good morning, the scrub finished:
Code:
Every 10.0s: zpool status -vv                                                                                                               pve: Fri Dec  3 09:53:08 2021
  pool: vijver
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-9P
  scan: scrub repaired 112K in 14:35:50 with 0 errors on Fri Dec  3 02:53:21 2021

config:
        NAME        STATE     READ WRITE CKSUM
        vijver      ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            sdb     ONLINE       0     0     0
            sdc1    ONLINE       0     0     7

errors: No known data errors

The checksum error seems to be generated at 1 AM every day, and the scrub seems to have added a few more. I can't find anything in the logs; I looked at cron jobs, USB driver messages around 1 AM, and other activity, but saw nothing.

I checked the Solaris ZFS Administration Guide, particularly the "Determining the Type of Device Failure" section, and I think low numbers in the CKSUM column, especially when they're not accompanied by driver messages (i.e. USB in my case), are not too worrisome. Still, that guide is for Solaris, not ZFS on Linux.
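One way to narrow down that 1 AM trigger is to list everything scheduled for that hour. A rough sketch, assuming a Debian-style cron layout (systemd timers would need a separate `systemctl list-timers` check):

```shell
# In a crontab line the second field is the hour, so match entries
# whose hour field is exactly "1" (i.e. jobs firing at 01:xx).
grep -hE '^[0-9*/,-]+[[:space:]]+1[[:space:]]' \
  /etc/crontab /etc/cron.d/* 2>/dev/null \
  || echo "no 1 AM cron entries found"
```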

I have a lot of apt updates to do on my node, I think that will be next on my list.
 
I think low numbers in the CKSUM column
I like systems without errors, and it is not normal to have those.
The checksum error seems to be generated at 1 AM every day, and the scrub seems to have added a few more. I can't find anything in the logs; I looked at cron jobs, USB driver messages around 1 AM, and other activity, but saw nothing.
If the I/O is just slow, it will not show up in any log. There are a lot of errors that drivers do not detect; most silent data corruption is, by definition, invisible to the driver. That is why ZFS checks for it specifically, by reading the data back and comparing checksums; if they don't match, you have a problem. You should not be seeing such errors at all.

As already stated, I think the problem is in the I/O path. I would never run a pool with a mix of internal and external devices; external ones are more likely to misbehave, as you can see in your case. I recommend moving to an all-internal setup.
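The detection mechanism described above can be illustrated with plain sha256sum (a toy sketch; ZFS does this per block, with fletcher4 or sha256, transparently on every read):

```shell
# "Write": store the data plus its checksum.
printf 'important data' > /tmp/zfs_demo_block
sha256sum /tmp/zfs_demo_block > /tmp/zfs_demo_block.sum
# Simulate silent corruption that the disk driver never reports:
printf 'important dat4' > /tmp/zfs_demo_block
# "Read": re-hash and compare -- this mismatch is what ends up
# counted in the CKSUM column.
sha256sum -c /tmp/zfs_demo_block.sum >/dev/null 2>&1 \
  && echo "block OK" || echo "CHECKSUM ERROR detected"
```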
 
I like systems without errors, and it is not normal to have those.

If the I/O is just slow, it will not show up in any log. There are a lot of errors that drivers do not detect; most silent data corruption is, by definition, invisible to the driver. That is why ZFS checks for it specifically, by reading the data back and comparing checksums; if they don't match, you have a problem. You should not be seeing such errors at all.

As already stated, I think the problem is in the I/O path. I would never run a pool with a mix of internal and external devices; external ones are more likely to misbehave, as you can see in your case. I recommend moving to an all-internal setup.
Yeah, I hear you. In a quest to bring down power consumption I "upgraded" to 2.5-inch disks.

(In case anyone wonders what I'm doing: since I couldn't find a replacement 5TB 2.5-inch disk, I bought a WD Elements external disk. These have the USB interface directly on the PCB; there is no enclosure containing a SATA-to-USB3 adapter. So unless I start swapping PCBs, I can only connect this drive through USB.)
 
(In case anyone wonders what I'm doing: since I couldn't find a replacement 5TB 2.5-inch disk, I bought a WD Elements external disk. These have the USB interface directly on the PCB; there is no enclosure containing a SATA-to-USB3 adapter. So unless I start swapping PCBs, I can only connect this drive through USB.)
Oh, I was not aware that direct USB is a thing.
 
As 2.5" HDD external drives were some 25% cheaper than exactly the same HDD without the SATA-to-USB adapter, enclosure, and cable, this was, for a short time, a source of inexpensive 2.5" hard disks - if you were willing to forego the warranty.

Here's a (complete) teardown of a WD drive on YouTube:

https://youtu.be/wP4l_L81NKw

I realize the quest for lower power consumption has gone too far, and I now have my sights set on a 3.5" NAS HDD as a replacement.

Still, the 2.5" drives are quieter, too.
 
As 2.5" HDD external drives were some 25% cheaper than exactly the same HDD without the SATA-to-USB adapter, enclosure, and cable, this was, for a short time, a source of inexpensive 2.5" hard disks - if you were willing to forego the warranty.

Here's a (complete) teardown of a WD drive on YouTube:

https://youtu.be/wP4l_L81NKw

I realize the quest for lower power consumption has gone too far, and I now have my sights set on a 3.5" NAS HDD as a replacement.

Still, the 2.5" drives are quieter, too.

There are 5TB 2.5" Seagate Barracuda Compute drives. However, the 3-5TB 2.5" drives are 15mm high and don't fit everywhere.
 
There are 5TB 2.5" Seagate Barracuda Compute drives. However, the 3-5TB 2.5" drives are 15mm high and don't fit everywhere.

Behold, and be awestruck by my well-laid-out setup.

that_bears_the_hallmark_of_a_real_pro.JPG

I have no problem with 15mm. Or 16mm.

Right now a Seagate 5TB disk, the ST5000LM000 is sold locally for 105 euros ($119) in external form and for 125 euros ($141) as a naked disk.
 
PicoPSU - and a PicoUPS, with an unenclosed motorcycle battery on the floor. :)

The "unenclosed PSU" is actually the UPS, with a brick psu and a battery hooked up to it. It works a treat.
 
