Importing ZFS pool causes Kernel panic on HP DL380P Gen8

dratcliffe
Mar 24, 2022
Hello All,

I'm relatively new to Proxmox and have been using it for the past six months without issue. I currently have an HP DL380p Gen8 server with a P420 RAID controller in HBA pass-through mode and eight 1 TB Samsung QVO SSDs for storage.

I've been running ZFS on it for quite some time and realized I had forgotten to enable autotrim. I enabled autotrim on the pool, and then it all went sideways: the kernel panics, the HP fans shoot up to 100%, and the pool goes offline due to I/O errors. I've managed to import it briefly and confirmed the data shows as available, but after about 15 seconds the lockup and panic recur.
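For reference, this is the kind of one-line property change that triggered the problem (the pool name ProxStore is taken from the logs below; run only against a pool you intend to change):

```shell
# Enable automatic TRIM/UNMAP on the pool -- the step that preceded the lockups
zpool set autotrim=on ProxStore

# Verify the current value of the property
zpool get autotrim ProxStore
```

With autotrim on, ZFS starts issuing UNMAP commands to the drives as blocks are freed, which is what the P420 then appears to choke on.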

proxmox-ve: 7.1-1 (running kernel: 5.13.19-2-pve)
pve-manager: 7.1-8 (running version: 7.1-8/5b267f33)
pve-kernel-helper: 7.1-13
pve-kernel-5.13: 7.1-5
pve-kernel-5.13.19-2-pve: 5.13.19-4
ceph-fuse: 15.2.15-pve1
corosync: 3.1.5-pve2
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown2: 3.1.0-1+pmx3
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.22-pve2
libproxmox-acme-perl: 1.4.1
libproxmox-backup-qemu0: 1.2.0-1
libpve-access-control: 7.1-6
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.0-14
libpve-guest-common-perl: 4.0-3
libpve-http-server-perl: 4.1-1
libpve-storage-perl: 7.1-1
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 4.0.11-1
lxcfs: 4.0.11-pve1
novnc-pve: 1.3.0-2
proxmox-backup-client: 2.1.5-1
proxmox-backup-file-restore: 2.1.5-1
proxmox-mini-journalreader: 1.3-1
proxmox-widget-toolkit: 3.4-7
pve-cluster: 7.1-3
pve-container: 4.1-3
pve-docs: 7.1-2
pve-edk2-firmware: 3.20210831-2
pve-firewall: 4.2-5
pve-firmware: 3.3-6
pve-ha-manager: 3.3-3
pve-i18n: 2.6-2
pve-qemu-kvm: 6.1.1-2
pve-xtermjs: 4.16.0-1
qemu-server: 7.1-4
smartmontools: 7.2-1
spiceterm: 3.2-2
swtpm: 0.7.1~bpo11+1
vncterm: 1.7-1
zfsutils-linux: 2.1.2-pve1


Mar 24 08:41:26 proxhost kernel: hpsa 0000:02:00.0: Controller lockup detected: 0xffff0000 after 30
Mar 24 08:41:26 proxhost kernel: hpsa 0000:02:00.0: Telling controller to do a CHKPT
Mar 24 08:41:26 proxhost kernel: hpsa 0000:02:00.0: failed 1 commands in fail_all
Mar 24 08:41:26 proxhost kernel: hpsa 0000:02:00.0: Controller lockup detected during reset wait
Mar 24 08:41:26 proxhost kernel: hpsa 0000:02:00.0: scsi 2:0:7:0: reset physical failed Direct-Access ATA Samsung SSD 870 PHYS DRV SSDSmartPathCap- En- Exp=1
Mar 24 08:41:26 proxhost kernel: sd 2:0:7:0: Device offlined - not ready after error recovery
Mar 24 08:41:26 proxhost kernel: sd 2:0:7:0: [sdh] tag#3 FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK cmd_age=117s
Mar 24 08:41:26 proxhost kernel: sd 2:0:7:0: [sdh] tag#3 CDB: Unmap/Read sub-channel 42 00 00 00 00 00 00 00 18 00
Mar 24 08:41:26 proxhost kernel: blk_update_request: I/O error, dev sdh, sector 1509961832 op 0x3:(DISCARD) flags 0x800 phys_seg 1 prio class 0
Mar 24 08:41:26 proxhost kernel: sd 2:0:1:0: [sdb] tag#10 FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK cmd_age=0s
Mar 24 08:41:26 proxhost kernel: sd 2:0:1:0: [sdb] tag#10 CDB: Read(10) 28 00 00 00 00 00 00 00 f8 00
Mar 24 08:41:26 proxhost kernel: blk_update_request: I/O error, dev sdb, sector 0 op 0x0:(READ) flags 0x4000 phys_seg 31 prio class 0
Mar 24 08:41:26 proxhost kernel: sd 2:0:1:0: [sdb] tag#11 FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK cmd_age=0s
Mar 24 08:41:26 proxhost kernel: sd 2:0:1:0: [sdb] tag#11 CDB: Read(10) 28 00 00 00 00 f8 00 00 08 00
Mar 24 08:41:26 proxhost kernel: blk_update_request: I/O error, dev sdb, sector 248 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
Mar 24 08:41:26 proxhost kernel: sd 2:0:1:0: [sdb] tag#12 FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK cmd_age=0s
Mar 24 08:41:26 proxhost kernel: sd 2:0:1:0: [sdb] tag#12 CDB: Read(10) 28 00 00 00 08 00 00 01 00 00
Mar 24 08:41:26 proxhost kernel: blk_update_request: I/O error, dev sdb, sector 2048 op 0x0:(READ) flags 0x0 phys_seg 31 prio class 0
Mar 24 08:41:26 proxhost kernel: sd 2:0:1:0: [sdb] tag#13 FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK cmd_age=0s
Mar 24 08:41:26 proxhost kernel: sd 2:0:1:0: [sdb] tag#13 CDB: Read(10) 28 00 74 70 28 00 00 01 00 00
Mar 24 08:41:26 proxhost kernel: blk_update_request: I/O error, dev sdb, sector 1953507328 op 0x0:(READ) flags 0x0 phys_seg 18 prio class 0
Mar 24 08:41:26 proxhost kernel: sd 2:0:2:0: [sdc] tag#14 FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK cmd_age=0s
Mar 24 08:41:26 proxhost kernel: sd 2:0:2:0: [sdc] tag#14 CDB: Read(10) 28 00 00 00 00 00 00 01 00 00
Mar 24 08:41:26 proxhost kernel: blk_update_request: I/O error, dev sdc, sector 0 op 0x0:(READ) flags 0x0 phys_seg 14 prio class 0
Mar 24 08:41:26 proxhost kernel: sd 2:0:2:0: [sdc] tag#15 FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK cmd_age=0s
Mar 24 08:41:26 proxhost kernel: sd 2:0:2:0: [sdc] tag#15 CDB: Read(10) 28 00 00 00 08 00 00 01 00 00
Mar 24 08:41:26 proxhost kernel: blk_update_request: I/O error, dev sdc, sector 2048 op 0x0:(READ) flags 0x0 phys_seg 27 prio class 0
Mar 24 08:41:26 proxhost kernel: sd 2:0:2:0: [sdc] tag#12 FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK cmd_age=0s
Mar 24 08:41:26 proxhost kernel: sd 2:0:2:0: [sdc] tag#12 CDB: Read(10) 28 00 74 70 28 00 00 00 f8 00
Mar 24 08:41:26 proxhost kernel: blk_update_request: I/O error, dev sdc, sector 1953507328 op 0x0:(READ) flags 0x4000 phys_seg 31 prio class 0
Mar 24 08:41:26 proxhost kernel: sd 2:0:2:0: [sdc] tag#13 FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK cmd_age=0s
Mar 24 08:41:26 proxhost kernel: sd 2:0:2:0: [sdc] tag#13 CDB: Read(10) 28 00 74 70 28 f8 00 00 08 00
Mar 24 08:41:26 proxhost kernel: blk_update_request: I/O error, dev sdc, sector 1953507576 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
Mar 24 08:41:26 proxhost kernel: sd 2:0:3:0: [sdd] tag#14 FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK cmd_age=0s
Mar 24 08:41:26 proxhost kernel: sd 2:0:3:0: [sdd] tag#14 CDB: Read(10) 28 00 00 00 00 00 00 00 f8 00

---------------

Mar 24 08:42:06 proxhost pvestatd[1303]: zfs error: cannot open 'ProxStore': pool I/O is currently suspended
Mar 24 08:42:06 proxhost kernel: scsi_io_completion_action: 4 callbacks suppressed
Mar 24 08:42:06 proxhost kernel: sd 2:0:1:0: [sdb] tag#8 FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK cmd_age=0s
Mar 24 08:42:06 proxhost kernel: sd 2:0:1:0: [sdb] tag#8 CDB: Read(10) 28 00 00 00 00 00 00 00 f8 00
Mar 24 08:42:06 proxhost kernel: print_req_error: 6 callbacks suppressed
Mar 24 08:42:06 proxhost kernel: blk_update_request: I/O error, dev sdb, sector 0 op 0x0:(READ) flags 0x4000 phys_seg 31 prio class 0
Mar 24 08:42:06 proxhost kernel: sd 2:0:1:0: [sdb] tag#9 FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK cmd_age=0s
Mar 24 08:42:06 proxhost kernel: sd 2:0:1:0: [sdb] tag#9 CDB: Read(10) 28 00 00 00 00 f8 00 00 08 00
Mar 24 08:42:06 proxhost kernel: blk_update_request: I/O error, dev sdb, sector 248 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
Mar 24 08:42:06 proxhost kernel: sd 2:0:1:0: [sdb] tag#19 FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK cmd_age=0s
Mar 24 08:42:06 proxhost kernel: sd 2:0:1:0: [sdb] tag#19 CDB: Read(10) 28 00 00 00 08 00 00 00 f8 00
Mar 24 08:42:06 proxhost kernel: blk_update_request: I/O error, dev sdb, sector 2048 op 0x0:(READ) flags 0x4000 phys_seg 31 prio class 0
Mar 24 08:42:06 proxhost kernel: sd 2:0:1:0: [sdb] tag#16 FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK cmd_age=0s
Mar 24 08:42:06 proxhost kernel: sd 2:0:1:0: [sdb] tag#16 CDB: Read(10) 28 00 00 00 08 f8 00 00 08 00
Mar 24 08:42:06 proxhost kernel: blk_update_request: I/O error, dev sdb, sector 2296 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
Mar 24 08:42:06 proxhost kernel: sd 2:0:1:0: [sdb] tag#17 FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK cmd_age=0s
Mar 24 08:42:06 proxhost kernel: sd 2:0:1:0: [sdb] tag#17 CDB: Read(10) 28 00 74 70 28 00 00 00 f8 00
Mar 24 08:42:06 proxhost kernel: blk_update_request: I/O error, dev sdb, sector 1953507328 op 0x0:(READ) flags 0x4000 phys_seg 31 prio class 0
Mar 24 08:42:06 proxhost kernel: sd 2:0:1:0: [sdb] tag#18 FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK cmd_age=0s
Mar 24 08:42:06 proxhost kernel: sd 2:0:1:0: [sdb] tag#18 CDB: Read(10) 28 00 74 70 28 f8 00 00 08 00
Mar 24 08:42:06 proxhost kernel: blk_update_request: I/O error, dev sdb, sector 1953507576 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
Mar 24 08:42:06 proxhost kernel: sd 2:0:2:0: [sdc] tag#16 FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK cmd_age=0s
Mar 24 08:42:06 proxhost kernel: sd 2:0:2:0: [sdc] tag#16 CDB: Read(10) 28 00 00 00 00 00 00 01 00 00
Mar 24 08:42:06 proxhost kernel: blk_update_request: I/O error, dev sdc, sector 0 op 0x0:(READ) flags 0x0 phys_seg 31 prio class 0
Mar 24 08:42:06 proxhost kernel: sd 2:0:2:0: [sdc] tag#17 FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK cmd_age=0s
Mar 24 08:42:06 proxhost kernel: sd 2:0:2:0: [sdc] tag#17 CDB: Read(10) 28 00 00 00 08 00 00 00 f8 00
Mar 24 08:42:06 proxhost kernel: blk_update_request: I/O error, dev sdc, sector 2048 op 0x0:(READ) flags 0x4000 phys_seg 31 prio class 0
Mar 24 08:42:06 proxhost kernel: sd 2:0:2:0: [sdc] tag#18 FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK cmd_age=0s
Mar 24 08:42:06 proxhost kernel: sd 2:0:2:0: [sdc] tag#18 CDB: Read(10) 28 00 00 00 08 f8 00 00 08 00
Mar 24 08:42:06 proxhost kernel: blk_update_request: I/O error, dev sdc, sector 2296 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
Mar 24 08:42:06 proxhost kernel: sd 2:0:2:0: [sdc] tag#19 FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK cmd_age=0s
Mar 24 08:42:06 proxhost kernel: sd 2:0:2:0: [sdc] tag#19 CDB: Read(10) 28 00 74 70 28 00 00 00 f8 00
Mar 24 08:42:06 proxhost kernel: blk_update_request: I/O error, dev sdc, sector 1953507328 op 0x0:(READ) flags 0x4000 phys_seg 31 prio class 0
Mar 24 08:42:16 proxhost pvestatd[1303]: zfs error: cannot open 'ProxStore': pool I/O is currently suspended
Mar 24 08:42:26 proxhost pvestatd[1303]: zfs error: cannot open 'ProxStore': pool I/O is currently suspended
Mar 24 08:42:36 proxhost pvestatd[1303]: zfs error: cannot open 'ProxStore': pool I/O is currently suspended
Mar 24 08:42:46 proxhost pvestatd[1303]: zfs error: cannot open 'ProxStore': pool I/O is currently suspended
 
The P420 controllers are not a good fit for ZFS, and autotrim/TRIM has reportedly caused issues with consumer-grade SSDs behind them.

Try to see if you can export the pool before the kernel panic occurs, so it doesn't keep auto-importing on boot.

Then reboot and try:

zpool import -N <poolname>

The pool will not mount its datasets, but you should be able to run zpool status and, hopefully, disable autotrim.

Failing that, try:
zpool import -o readonly=on -R /mnt <poolname>
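Putting the steps above together, a recovery attempt might look like this (a sketch, assuming the pool name ProxStore from the logs and /mnt as an illustrative altroot):

```shell
# Import the pool without mounting any of its datasets
zpool import -N ProxStore

# Check pool health while nothing is mounted
zpool status ProxStore

# Turn autotrim back off so the controller stops receiving UNMAP commands
zpool set autotrim=off ProxStore

# Cleanly export again before rebooting
zpool export ProxStore

# If even that hangs, fall back to a read-only import for data rescue
zpool import -o readonly=on -R /mnt ProxStore
```

The idea is to get the autotrim property flipped back off during the short window before the controller locks up; the read-only import avoids writes (including TRIM) entirely.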
 
I have tried a few times, and the pool won't export before the kernel panic occurs. I believe it may be an issue with the controller itself. Rather than fight with it, I've ordered an LSI Logic SAS 9207-8i storage controller (LSI00301).

My Proxmox OS is installed on a separate drive from the ones behind the P420. That said, will ZFS let me connect all of the drives to the new storage controller and rebuild/import the pool?
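It should: ZFS identifies pool members by the labels written on the disks themselves, not by which controller or /dev node they sit behind. After moving the drives, an import scan ought to find the pool again (pool name from the logs; device paths illustrative):

```shell
# Scan for importable pools using stable by-id device paths
zpool import -d /dev/disk/by-id

# If the pool is listed, import it by name (or by its numeric id)
zpool import -d /dev/disk/by-id ProxStore
```

Importing via /dev/disk/by-id also keeps the vdev names stable if device letters (sdb, sdc, ...) get reshuffled by the new controller.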
 
Hey all,

Thanks for the help. I've installed the LSI Logic SAS 9207-8i in IT mode, and the ZFS array came right up: no issues, no kernel panic, back to normal. The fans are running a bit harder, but that's because the server's fan management can't see the drives behind the third-party controller to read their temperatures. I'll flash the BIOS with a custom image to adjust the fan speeds and we'll be all set.

To those who find this in the future: the real solution is to move away from the P420i/P420 controllers that HP DL380p servers often ship with and get a proper SAS HBA. Lessons have been learned.
 
