ZFS 0.8.4 crashes kernel

elurex

Is anyone else seeing the following ZFS crash? It has already happened on two of my PVE nodes.

Code:
Jun 10 15:14:40 pveLA01 kernel: [2306462.646839] Code: 75 0e 4d 89 f9 41 f6 47 0b 04 0f 84 f4 fe ff ff 4c 89 ff e8 1a f5 01 00 49 89 c1 e9 e4 fe ff ff 41 8b 41 20 49 8b 39 4c 01 d0 <48> 8b 18 48 89 c1 49 33 99 70 01 00 00 4c 89 d0 48 0f c9 48 31 cb
Jun 10 15:14:40 pveLA01 kernel: [2306462.650572] R10: dd65a112c2f52240 R11: ffff993269741560 R12: 0000000000042d00
Jun 10 15:14:40 pveLA01 kernel: [2306462.985224] RBP: ffffaac419b479f0 R08: ffff991aa08b0040 R09: ffff991aa0007b80
Jun 10 15:14:40 pveLA01 kernel: [2306462.988381] CR2: 000000c000e85000 CR3: 0000002eeaf26006 CR4: 00000000007626e0
Jun 10 15:14:40 pveLA01 kernel: [2306462.990328] PKRU: 55555554
Jun 10 15:14:40 pveLA01 kernel: [2306462.992162]  spl_kmem_zalloc+0xe9/0x140 [spl]
Jun 10 15:14:40 pveLA01 kernel: [2306462.993985]  dmu_write_uio_dnode+0x4c/0x140 [zfs]
Jun 10 15:14:40 pveLA01 kernel: [2306462.995688]  zfs_write+0xa1b/0xed0 [zfs]
Jun 10 15:14:40 pveLA01 kernel: [2306462.997247]  zpl_iter_write+0xee/0x130 [zfs]
Jun 10 15:14:40 pveLA01 kernel: [2306462.998627]  vfs_write+0xab/0x1b0
Jun 10 15:14:40 pveLA01 kernel: [2306462.999925]  do_syscall_64+0x57/0x190
Jun 10 15:14:40 pveLA01 kernel: [2306463.002056] RSP: 002b:00007ff6d1152840 EFLAGS: 00000293 ORIG_RAX: 0000000000000001
Jun 10 15:14:40 pveLA01 kernel: [2306463.003825] R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000072
Jun 10 15:14:40 pveLA01 kernel: [2306463.011854] ---[ end trace 5fcf6e5bb7cbecb4 ]---
Jun 10 15:14:41 pveLA01 kernel: [2306463.772578] RSP: 0018:ffffaac4a742f9b0 EFLAGS: 00010282
Jun 10 15:14:41 pveLA01 kernel: [2306464.099178] RSP: 0018:ffffaac4a742f9b0 EFLAGS: 00010282
Jun 10 15:14:41 pveLA01 kernel: [2306464.101652] R10: dd65a112c2f52240 R11: ffff993269741560 R12: 0000000000042d00
Jun 10 15:14:41 pveLA01 kernel: [2306464.103360] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jun 10 15:15:03 pveLA01 kernel: [2306486.413641] ---[ end trace 5fcf6e5bb7cbecb7 ]---
Jun 10 15:15:03 pveLA01 kernel: [2306486.433023] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Jun 10 15:15:14 pveLA01 kernel: [2306496.459072] RAX: dd65a112c2f52240 RBX: 0000000000000000 RCX: 0000000000000000
Jun 10 15:15:14 pveLA01 kernel: [2306496.461582] R13: 0000000000000008 R14: 00000000ffffffff R15: ffff991aa0007b80
Jun 10 15:15:14 pveLA01 kernel: [2306496.463984] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jun 10 15:15:14 pveLA01 kernel: [2306496.466197]  ? __vmalloc_node_range+0xd4/0x270
Jun 10 15:15:14 pveLA01 kernel: [2306496.468203]  alloc_counters.isra.11+0x2b/0x130 [ip6_tables]
Jun 10 15:15:14 pveLA01 kernel: [2306496.470021]  ipv6_getsockopt+0xa1/0xe0
Jun 10 15:15:14 pveLA01 kernel: [2306496.471695]  __x64_sys_getsockopt+0x24/0x30
Jun 10 15:15:14 pveLA01 kernel: [2306496.473285] Code: 48 8b 0d c9 08 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 37 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 96 08 0c 00 f7 d8 64 89 01 48
Jun 10 15:15:44 gpu01-la3 kernel: [2306526.601639] general protection fault: 0000 [#9] SMP NOPTI
Jun 10 15:15:44 gpu01-la3 kernel: [2306526.602247] CPU: 10 PID: 1567647 Comm: zvol Tainted: P      D    OE     5.4.34-1-pve #1
Jun 10 15:15:44 gpu01-la3 kernel: [2306526.603444] RIP: 0010:__kmalloc_node+0x198/0x330
Jun 10 15:15:44 gpu01-la3 kernel: [2306526.604020] Code: 75 0e 4d 89 f9 41 f6 47 0b 04 0f 84 f4 fe ff ff 4c 89 ff e8 1a f5 01 00 49 89 c1 e9 e4 fe ff ff 41 8b 41 20 49 8b 39 4c 01 d0 <48> 8b 18 48 89 c1 49 33 99 70 01 00 00 4c 89 d0 48 0f c9 48 31 cb
Jun 10 15:15:44 gpu01-la3 kernel: [2306526.605084] RSP: 0018:ffffaac415e83ba0 EFLAGS: 00010282
Jun 10 15:15:44 gpu01-la3 kernel: [2306526.605583] RAX: dd65a112c2f52240 RBX: 0000000000000000 RCX: 0000000000000000
Jun 10 15:15:44 gpu01-la3 kernel: [2306526.606548] RBP: ffffaac415e83be0 R08: ffff991aa08b0040 R09: ffff991aa0007b80
Jun 10 15:15:44 gpu01-la3 kernel: [2306526.607009] R10: dd65a112c2f52240 R11: 00000000842c7000 R12: 0000000000042d00
Jun 10 15:15:44 gpu01-la3 kernel: [2306526.607914] FS:  0000000000000000(0000) GS:ffff991aa0880000(0000) knlGS:0000000000000000
Jun 10 15:15:44 gpu01-la3 kernel: [2306526.608776] CR2: 0000011e159ad000 CR3: 0000002276a0a002 CR4: 00000000007626e0
Jun 10 15:15:44 gpu01-la3 kernel: [2306526.609598] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Jun 10 15:15:44 gpu01-la3 kernel: [2306526.610753]  ? spl_kmem_zalloc+0xe9/0x140 [spl]
Jun 10 15:15:44 gpu01-la3 kernel: [2306526.612251]  dmu_write_uio_dnode+0x4c/0x140 [zfs]
Jun 10 15:15:44 gpu01-la3 kernel: [2306526.613671]  kthread+0x120/0x140
Jun 10 15:15:44 gpu01-la3 kernel: [2306526.614967] Modules linked in: tcp_diag(E) inet_diag(E) rbd(E) libceph(E) veth(E) ebtable_filter(E) ebtables(E) ip_set(E) ip6table_raw(E) iptable_raw(E) ip6table_filter(E) ip6_tables(E) sctp(E) iptable_filter(E) bpfilter(E) binfmt_misc(E) ipmi_watchdog(E) bonding(E) nfnetlink_log(E) nfnetlink(E) ipmi_ssif(E) snd_hda_codec_hdmi(E) intel_rapl_msr(E) intel_rapl_common(E) isst_if_common(E) skx_edac(E) nfit(E) x86_pkg_temp_thermal(E) intel_powerclamp(E) coretemp(E) kvm_intel(E) kvm(E) crct10dif_pclmul(E) crc32_pclmul(E) ghash_clmulni_intel(E) aesni_intel(E) crypto_simd(E) snd_hda_intel(E) joydev(E) input_leds(E) cryptd(E) snd_intel_dspcfg(E) glue_helper(E) snd_hda_codec(E) snd_hda_core(E) ast(E) snd_hwdep(E) intel_cstate(E) drm_vram_helper(E) snd_pcm(E) ttm(E) snd_timer(E) drm_kms_helper(E) intel_rapl_perf(E) snd(E) drm(E) soundcore(E) i2c_algo_bit(E) fb_sys_fops(E) mei_me(E) syscopyarea(E) sysfillrect(E) sysimgblt(E) mei(E) ioatdma(E) ipmi_si(E) ipmi_devintf(E) ipmi_msghandler(E)

My package versions:
Code:
proxmox-ve: 6.2-1 (running kernel: 5.4.41-1-pve)
pve-manager: 6.2-6 (running version: 6.2-6/ee1d7754)
pve-kernel-5.4: 6.2-2
pve-kernel-helper: 6.2-2
pve-kernel-5.3: 6.1-6
pve-kernel-5.0: 6.0-11
pve-kernel-5.4.41-1-pve: 5.4.41-1
pve-kernel-5.4.34-1-pve: 5.4.34-2
pve-kernel-5.3.18-3-pve: 5.3.18-3
pve-kernel-5.3.13-1-pve: 5.3.13-1
pve-kernel-5.0.21-5-pve: 5.0.21-10
pve-kernel-5.0.15-1-pve: 5.0.15-1
ceph: 14.2.9-pve1
ceph-fuse: 14.2.9-pve1
corosync: 3.0.3-pve1
criu: 3.11-3
glusterfs-client: 7.6-1
ifupdown: 0.8.35+pve1
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.15-pve1
libproxmox-acme-perl: 1.0.4
libpve-access-control: 6.1-1
libpve-apiclient-perl: 3.0-3
libpve-common-perl: 6.1-3
libpve-guest-common-perl: 3.0-10
libpve-http-server-perl: 3.0-5
libpve-storage-perl: 6.1-8
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve4
lxc-pve: 4.0.2-1
lxcfs: 4.0.3-pve2
novnc-pve: 1.1.0-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.2-7
pve-cluster: 6.1-8
pve-container: 3.1-8
pve-docs: 6.2-4
pve-edk2-firmware: 2.20200229-1
pve-firewall: 4.1-2
pve-firmware: 3.1-1
pve-ha-manager: 3.0-9
pve-i18n: 2.1-3
pve-qemu-kvm: 5.0.0-4
pve-xtermjs: 4.3.0-1
qemu-server: 6.2-3
smartmontools: 7.1-pve2
spiceterm: 3.1-1
vncterm: 1.6-1
zfsutils-linux: 0.8.4-pve1
 
Your log shows that you are running an older kernel (5.4.34-1-pve), not the one shown in your pveversion -v output (5.4.41-1-pve).

Please check again.
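
A minimal sketch of that check, assuming the kernel log and the `pveversion -v` output are available as plain strings. The function names and regular expressions are illustrative only and are not part of any Proxmox tool. It pulls the kernel release out of each oops `CPU:` line and compares it with the `running kernel` field:

```python
import re

# Kernel release at the end of an oops "CPU:" line, e.g.
# "CPU: 10 PID: 1567647 Comm: zvol Tainted: P      D    OE     5.4.34-1-pve #1"
OOPS_KERNEL_RE = re.compile(
    r"CPU: \d+ PID: \d+ Comm: .*? (\d+\.\d+\.\d+-\d+-pve) #\d+"
)

# "running kernel" field on the first line of `pveversion -v` output, e.g.
# "proxmox-ve: 6.2-1 (running kernel: 5.4.41-1-pve)"
PVEVERSION_RE = re.compile(r"running kernel: (\S+?)\)")


def oops_kernels(log_text):
    """Return the set of kernel releases named in oops CPU lines."""
    return set(OOPS_KERNEL_RE.findall(log_text))


def reported_kernel(pveversion_text):
    """Return the running kernel from `pveversion -v` output, or None."""
    match = PVEVERSION_RE.search(pveversion_text)
    return match.group(1) if match else None


def mismatched_kernels(log_text, pveversion_text):
    """Return oops kernels that differ from the kernel pveversion reports."""
    reported = reported_kernel(pveversion_text)
    return sorted(k for k in oops_kernels(log_text) if k != reported)


if __name__ == "__main__":
    log = (
        "Jun 10 15:15:44 gpu01-la3 kernel: [2306526.602247] CPU: 10 "
        "PID: 1567647 Comm: zvol Tainted: P      D    OE     5.4.34-1-pve #1"
    )
    versions = "proxmox-ve: 6.2-1 (running kernel: 5.4.41-1-pve)"
    print(mismatched_kernels(log, versions))
```

A non-empty result means the trace was recorded under a different kernel than the one currently running, for example because the node was rebooted into a newer kernel after the crash.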
 
This is consistent with what I see on three of my servers; they all show the following error:

Code:
kernel: [985733.248968] general protection fault: 0000 [#11] SMP NOPTI
kernel: [985733.249586] CPU: 34 PID: 3981331 Comm: apparmor_parser Tainted: P      D    O      5.4.44-2-pve #1
kernel: [985733.250225] Hardware name: GIGABYTE G291-280-00/MG51-G21-00, BIOS R06 11/19/2019
kernel: [985733.250827] RIP: 0010:__kmalloc_node+0x198/0x330
kernel: [985733.251352] Code: 75 0e 4d 89 f9 41 f6 47 0b 04 0f 84 f4 fe ff ff 4c 89 ff e8 3a f5 01 00 49 89 c1 e9 e4 fe ff ff 41 8b 41 20 49 8b 39 4c 01 d0 <48> 8b 18 48 89 c1 49 33 99 70 01 00 00 4c 89 d0 48 0f c9 48 31 cb
kernel: [985733.252438] RSP: 0018:ffff9c485f583b00 EFLAGS: 00010206
kernel: [985733.252987] RAX: 471f4ca497315cf6 RBX: 0000000000000000 RCX: 0000000000000000
kernel: [985733.253575] RDX: 000000000078c229 RSI: 0000000000042d00 RDI: 0000000000030040
kernel: [985733.254200] RBP: ffff9c485f583b40 R08: ffff8e329f630040 R09: ffff8e1aa0007b80
kernel: [985733.254820] R10: 471f4ca497315cf6 R11: 0000000000000000 R12: 0000000000042d00
kernel: [985733.255355] R13: 0000000000000008 R14: 00000000ffffffff R15: ffff8e1aa0007b80
kernel: [985733.255880] FS:  00007f7af93b8740(0000) GS:ffff8e329f600000(0000) knlGS:0000000000000000
kernel: [985733.256395] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
kernel: [985733.256903] CR2: 00007f7af942b1e0 CR3: 00000023d1260004 CR4: 00000000007626e0
kernel: [985733.257380] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
kernel: [985733.257933] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
kernel: [985733.258458] PKRU: 55555554
kernel: [985733.258946] Call Trace:
kernel: [985733.259393]  ? spl_kmem_zalloc+0xe9/0x140 [spl]
kernel: [985733.259815]  spl_kmem_zalloc+0xe9/0x140 [spl]
kernel: [985733.260280]  dmu_buf_hold_array_by_dnode+0x84/0x480 [zfs]
kernel: [985733.260703]  ? spl_kmem_alloc+0xec/0x140 [spl]
kernel: [985733.261160]  dmu_read_uio_dnode+0x49/0xf0 [zfs]
kernel: [985733.261630]  ? zfs_rangelock_enter+0x150/0x580 [zfs]
kernel: [985733.262083]  dmu_read_uio_dbuf+0x45/0x60 [zfs]
kernel: [985733.262539]  zfs_read+0x12d/0x470 [zfs]
kernel: [985733.263382]  zpl_iter_read+0xfd/0x170 [zfs]
kernel: [985733.264125]  __vfs_read+0x29/0x40
kernel: [985733.264848]  ksys_read+0x61/0xe0
kernel: [985733.265550]  do_syscall_64+0x57/0x190
kernel: [985733.266305] RIP: 0033:0x7f7af94a5461
kernel: [985733.267408] RSP: 002b:00007fff89ab2b58 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
kernel: [985733.268184] RDX: 0000000000000800 RSI: 0000565152bcf7a0 RDI: 0000000000000003
kernel: [985733.268982] R10: 0000565152bbd010 R11: 0000000000000246 R12: 00007f7af9572760
kernel: [985733.269844] Modules linked in: veth rbd libceph ebtable_filter ebtables ip_set ip6table_raw iptable_raw ip6table_filter ip6_tables sctp iptable_filter bpfilter binfmt_misc ipmi_watchdog bonding nfnetlink_log nfnetlink intel_rapl_msr intel_rapl_common isst_if_common skx_edac nfit x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass ipmi_ssif crct10dif_pclmul crc32_pclmul ghash_clmulni_intel amdgpu snd_hda_codec_hdmi aesni_intel crypto_simd input_leds joydev cryptd glue_helper snd_hda_intel amd_iommu_v2 ast snd_intel_dspcfg gpu_sched drm_vram_helper snd_hda_codec intel_cstate ttm snd_hda_core snd_hwdep drm_kms_helper snd_pcm intel_rapl_perf snd_timer drm snd soundcore i2c_algo_bit fb_sys_fops syscopyarea mei_me sysfillrect sysimgblt ioatdma mei ipmi_si ipmi_devintf ipmi_msghandler acpi_pad acpi_power_meter mac_hid vhost_net vhost tap ib_iser rdma_cm iw_cm ib_cm iscsi_tcp libiscsi_tcp sunrpc libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 zfs(PO) zunicode(PO)
kernel: [985733.275832] ---[ end trace 4c558571425b8dc7 ]---
kernel: [985733.287227] RIP: 0010:__kmalloc_node+0x198/0x330
kernel: [985733.288924] RSP: 0018:ffff9c486b2fbb00 EFLAGS: 00010206
kernel: [985733.290697] RBP: ffff9c486b2fbb40 R08: ffff8e329f630040 R09: ffff8e1aa0007b80
kernel: [985733.291726] R13: 0000000000000008 R14: 00000000ffffffff R15: ffff8e1aa0007b80
kernel: [985733.292712] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
kernel: [985733.293863] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
kernel: [985733.294483] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
kernel: [985733.295069] PKRU: 55555554
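
To compare traces like these across nodes, one option is to reduce each log to its faulting kernel symbol. The sketch below is hypothetical helper code (the names `RIP_RE` and `faulting_symbols` are made up for this example). It assumes kernel-mode `RIP: 0010:` lines in the format shown above and skips user-space `RIP: 0033:` lines:

```python
import re
from collections import Counter

# Kernel-mode instruction pointer line, e.g.
# "RIP: 0010:__kmalloc_node+0x198/0x330"
# The symbol may contain dots (compiler-generated suffixes such as ".isra.11").
RIP_RE = re.compile(r"RIP: 0010:([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+")


def faulting_symbols(log_text):
    """Count how often each kernel symbol appears as the faulting RIP."""
    return Counter(RIP_RE.findall(log_text))


if __name__ == "__main__":
    sample = (
        "kernel: [985733.250827] RIP: 0010:__kmalloc_node+0x198/0x330\n"
        "kernel: [985733.266305] RIP: 0033:0x7f7af94a5461\n"
        "kernel: [985733.287227] RIP: 0010:__kmalloc_node+0x198/0x330\n"
    )
    for symbol, count in faulting_symbols(sample).most_common():
        print(symbol, count)
```

If every node reports the same symbol and offset, the crashes likely share a signature, although that alone does not identify the root cause.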

My PVE version information
Code:
proxmox-ve: 6.2-1 (running kernel: 5.4.44-2-pve)
pve-manager: 6.2-6 (running version: 6.2-6/ee1d7754)
pve-kernel-5.4: 6.2-4
pve-kernel-helper: 6.2-4
pve-kernel-5.3: 6.1-6
pve-kernel-5.4.44-2-pve: 5.4.44-2
pve-kernel-5.4.41-1-pve: 5.4.41-1
pve-kernel-4.15: 5.4-6
pve-kernel-5.3.18-3-pve: 5.3.18-3
pve-kernel-4.15.18-18-pve: 4.15.18-44
pve-kernel-4.15.18-12-pve: 4.15.18-36
ceph: 14.2.9-pve1
ceph-fuse: 14.2.9-pve1
corosync: 3.0.4-pve1
criu: 3.11-3
glusterfs-client: 7.6-1
ifupdown: 0.8.35+pve1
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.16-pve1
libproxmox-acme-perl: 1.0.4
libpve-access-control: 6.1-1
libpve-apiclient-perl: 3.0-3
libpve-common-perl: 6.1-3
libpve-guest-common-perl: 3.0-10
libpve-http-server-perl: 3.0-5
libpve-storage-perl: 6.1-8
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve4
lxc-pve: 4.0.2-1
lxcfs: 4.0.3-pve3
novnc-pve: 1.1.0-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.2-8
pve-cluster: 6.1-8
pve-container: 3.1-8
pve-docs: 6.2-4
pve-edk2-firmware: 2.20200531-1
pve-firewall: 4.1-2
pve-firmware: 3.1-1
pve-ha-manager: 3.0-9
pve-i18n: 2.1-3
pve-qemu-kvm: 5.0.0-4
pve-xtermjs: 4.3.0-1
qemu-server: 6.2-3
smartmontools: 7.1-pve2
spiceterm: 3.1-1
vncterm: 1.6-1
zfsutils-linux: 0.8.4-pve1
 
