Hello,
Some info about my system first: it's all consumer-grade hardware.
I run Proxmox VE 6.3 and I created a raidz1 pool (3 disks, 1 parity) from the three SSDs listed in the zpool output further down. I had to create the pool manually because of the difference in disk sizes.
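For reference, the create command was roughly the following. I'm quoting it from memory, so the exact options (in particular ashift) are an assumption, but the pool name and disk paths match the layout shown further down:
Code:
# raidz1 across the three SSDs; -f is required because the devices
# are not the same size (the pool ends up sized by the smallest disk).
# ashift=12 is an assumption for 4K-sector SSDs.
zpool create -f -o ashift=12 vmdisks raidz1 \
    /dev/disk/by-id/ata-PNY_CS900_240GB_SSD_PNY43191910250107DA1 \
    /dev/disk/by-id/ata-SanDisk_SSD_PLUS_240GB_1835AF801426 \
    /dev/disk/by-id/ata-Crucial_CT256M550SSD1_14150DF074A0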
It initially worked well and I was able to move my VM storage onto it. But after a few days of running, it kernel-panicked and hung the system, which forced me to reboot.
Upon reboot, during the import of the zpool, I now consistently get the following panic:
Code:
Dec 29 22:29:23 pve1 kernel: PANIC: blkptr at 000000004c4feb84 has invalid TYPE 66
Dec 29 22:29:23 pve1 kernel: Showing stack for process 9353
Dec 29 22:29:23 pve1 kernel: CPU: 6 PID: 9353 Comm: zpool Tainted: P O 5.4.78-2-pve #1
Dec 29 22:29:23 pve1 kernel: Hardware name: System manufacturer System Product Name/Z170-A, BIOS 3802 03/15/2018
Dec 29 22:29:23 pve1 kernel: Call Trace:
Dec 29 22:29:23 pve1 kernel: dump_stack+0x6d/0x9a
Dec 29 22:29:23 pve1 kernel: spl_dumpstack+0x29/0x2b [spl]
Dec 29 22:29:23 pve1 kernel: vcmn_err.cold.1+0x60/0x94 [spl]
Dec 29 22:29:23 pve1 kernel: ? zio_execute+0x99/0xf0 [zfs]
Dec 29 22:29:23 pve1 kernel: ? _cond_resched+0x19/0x30
Dec 29 22:29:23 pve1 kernel: ? __kmalloc+0x197/0x280
Dec 29 22:29:23 pve1 kernel: ? sg_kmalloc+0x19/0x30
Dec 29 22:29:23 pve1 kernel: zfs_panic_recover+0x6f/0x90 [zfs]
Dec 29 22:29:23 pve1 kernel: ? spa_sync_allpools+0x130/0x130 [zfs]
Dec 29 22:29:23 pve1 kernel: zfs_blkptr_verify+0x265/0x400 [zfs]
Dec 29 22:29:23 pve1 kernel: ? abd_alloc+0x280/0x480 [zfs]
Dec 29 22:29:23 pve1 kernel: zio_read+0x42/0xc0 [zfs]
Dec 29 22:29:23 pve1 kernel: ? spa_sync_allpools+0x130/0x130 [zfs]
Dec 29 22:29:23 pve1 kernel: spa_load_verify_cb+0x186/0x1d0 [zfs]
Dec 29 22:29:23 pve1 kernel: traverse_visitbp+0x1f3/0x9e0 [zfs]
Dec 29 22:29:23 pve1 kernel: traverse_visitbp+0x359/0x9e0 [zfs]
Dec 29 22:29:23 pve1 kernel: traverse_visitbp+0x359/0x9e0 [zfs]
Dec 29 22:29:23 pve1 kernel: ? arc_read+0x475/0x1020 [zfs]
Dec 29 22:29:23 pve1 kernel: traverse_dnode+0xb6/0x1d0 [zfs]
Dec 29 22:29:23 pve1 kernel: traverse_visitbp+0x824/0x9e0 [zfs]
Dec 29 22:29:23 pve1 kernel: traverse_visitbp+0x359/0x9e0 [zfs]
Dec 29 22:29:23 pve1 kernel: traverse_visitbp+0x359/0x9e0 [zfs]
Dec 29 22:29:23 pve1 kernel: traverse_visitbp+0x359/0x9e0 [zfs]
Dec 29 22:29:23 pve1 kernel: traverse_visitbp+0x359/0x9e0 [zfs]
Dec 29 22:29:23 pve1 kernel: traverse_visitbp+0x359/0x9e0 [zfs]
Dec 29 22:29:23 pve1 kernel: ? arc_read+0x475/0x1020 [zfs]
Dec 29 22:29:23 pve1 kernel: traverse_dnode+0xb6/0x1d0 [zfs]
Dec 29 22:29:23 pve1 kernel: traverse_visitbp+0x6ab/0x9e0 [zfs]
Dec 29 22:29:23 pve1 kernel: traverse_impl+0x1e3/0x480 [zfs]
Dec 29 22:29:23 pve1 kernel: traverse_dataset_resume+0x46/0x50 [zfs]
Dec 29 22:29:23 pve1 kernel: ? spa_sync+0xfa0/0xfa0 [zfs]
Dec 29 22:29:23 pve1 kernel: traverse_pool+0x181/0x1b0 [zfs]
Dec 29 22:29:23 pve1 kernel: ? spa_sync+0xfa0/0xfa0 [zfs]
Dec 29 22:29:23 pve1 kernel: spa_load+0x1159/0x13b0 [zfs]
Dec 29 22:29:23 pve1 kernel: spa_load_best+0x57/0x2d0 [zfs]
Dec 29 22:29:23 pve1 kernel: ? zpool_get_load_policy+0x1aa/0x1c0 [zcommon]
Dec 29 22:29:23 pve1 kernel: spa_import+0x1ea/0x7f0 [zfs]
Dec 29 22:29:23 pve1 kernel: ? nvpair_value_common.part.13+0x14d/0x170 [znvpair]
Dec 29 22:29:23 pve1 kernel: zfs_ioc_pool_import+0x12d/0x150 [zfs]
Dec 29 22:29:23 pve1 kernel: zfsdev_ioctl+0x6db/0x8f0 [zfs]
Dec 29 22:29:23 pve1 kernel: ? lru_cache_add_active_or_unevictable+0x39/0xb0
Dec 29 22:29:23 pve1 kernel: do_vfs_ioctl+0xa9/0x640
Dec 29 22:29:23 pve1 kernel: ? handle_mm_fault+0xc9/0x1f0
Dec 29 22:29:23 pve1 kernel: ksys_ioctl+0x67/0x90
Dec 29 22:29:23 pve1 kernel: __x64_sys_ioctl+0x1a/0x20
Dec 29 22:29:23 pve1 kernel: do_syscall_64+0x57/0x190
Dec 29 22:29:23 pve1 kernel: entry_SYSCALL_64_after_hwframe+0x44/0xa9
Dec 29 22:29:23 pve1 kernel: RIP: 0033:0x7f058c627427
Dec 29 22:29:23 pve1 kernel: Code: 00 00 90 48 8b 05 69 aa 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 39 aa 0c 00 f7 d8 64 89 01 48
Dec 29 22:29:23 pve1 kernel: RSP: 002b:00007ffcb5f23498 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
Dec 29 22:29:23 pve1 kernel: RAX: ffffffffffffffda RBX: 00007ffcb5f23510 RCX: 00007f058c627427
Dec 29 22:29:23 pve1 kernel: RDX: 00007ffcb5f23510 RSI: 0000000000005a02 RDI: 0000000000000003
Dec 29 22:29:23 pve1 kernel: RBP: 00007ffcb5f27400 R08: 0000556970d4ac40 R09: 0000000000000079
Dec 29 22:29:23 pve1 kernel: R10: 0000556970d1a010 R11: 0000000000000246 R12: 0000556970d1b430
Dec 29 22:29:23 pve1 kernel: R13: 0000556970d306c0 R14: 0000000000000000 R15: 0000000000000000
pvesm then hangs and I need to reboot again.
I scoured the forum and disabled the zfs-import service, which lets me avoid the kernel panic and regain control of my Proxmox host. I restored my VMs from backup, so all is good.
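Concretely, disabling the automatic import looked roughly like this (the exact unit names can vary between setups, so treat this as a sketch):
Code:
# prevent the pool from being imported automatically at boot
# (check the actual unit names with: systemctl list-unit-files 'zfs-import*')
systemctl disable zfs-import-cache.service
systemctl disable zfs-import-scan.service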
The zpool seems healthy:
Code:
pool: vmdisks
id: 7025696728242074529
state: ONLINE
action: The pool can be imported using its name or numeric identifier.
config:
vmdisks                                          ONLINE
  raidz1-0                                       ONLINE
    ata-PNY_CS900_240GB_SSD_PNY43191910250107DA1  ONLINE
    ata-SanDisk_SSD_PLUS_240GB_1835AF801426       ONLINE
    ata-Crucial_CT256M550SSD1_14150DF074A0        ONLINE
I'm about to recreate the zpool, but I would like to know whether there is something obvious I'm doing wrong, and whether it will simply fail again.
Here is the pveversion output:
Code:
proxmox-ve: 6.3-1 (running kernel: 5.4.78-2-pve)
pve-manager: 6.3-3 (running version: 6.3-3/eee5f901)
pve-kernel-5.4: 6.3-3
pve-kernel-helper: 6.3-3
pve-kernel-5.4.78-2-pve: 5.4.78-2
pve-kernel-5.4.78-1-pve: 5.4.78-1
pve-kernel-5.4.73-1-pve: 5.4.73-1
ceph-fuse: 12.2.11+dfsg1-2.1+b1
corosync: 3.0.4-pve1
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown: 0.8.35+pve1
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.16-pve1
libproxmox-acme-perl: 1.0.7
libproxmox-backup-qemu0: 1.0.2-1
libpve-access-control: 6.1-3
libpve-apiclient-perl: 3.1-3
libpve-common-perl: 6.3-2
libpve-guest-common-perl: 3.1-3
libpve-http-server-perl: 3.1-1
libpve-storage-perl: 6.3-3
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve4
lxc-pve: 4.0.3-1
lxcfs: 4.0.3-pve3
novnc-pve: 1.1.0-1
proxmox-backup-client: 1.0.6-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.4-3
pve-cluster: 6.2-1
pve-container: 3.3-2
pve-docs: 6.3-1
pve-edk2-firmware: 2.20200531-1
pve-firewall: 4.1-3
pve-firmware: 3.1-3
pve-ha-manager: 3.1-1
pve-i18n: 2.2-2
pve-qemu-kvm: 5.1.0-7
pve-xtermjs: 4.7.0-3
qemu-server: 6.3-2
smartmontools: 7.1-pve2
spiceterm: 3.1-1
vncterm: 1.6-2
zfsutils-linux: 0.8.5-pve1
Thanks a lot for any hint!