rpool errors but system boots (rpool has detected an unrecoverable IO failure)

dastrix80

Oct 25, 2025
Hi All,

I'm using the latest Proxmox on 2 x Intel M.2/NVMe SSDs (brand new, 0% usage). These are for the system only, as a ZFS mirror.

I also have 2 x Dell SAS SSDs for VM storage, also as a ZFS mirror.

The system was all working nicely BEFORE I put the machine into a new case. I moved cards around, and I may also have moved the VM_Storage pool to an HBA instead of the onboard SAS ports; I can't recall.

The system gets an unrecoverable IO error when all 4 disks are installed, that is, the VM_Storage disks and the system boot disks. When I remove the VM_Storage disks (which, as you can guess, hold my VMs), the system boots and runs just fine.

Now.....

If I boot the system without the 2 x VM_Storage disks, insert only 1 of them once Proxmox is up and running, and run zpool status, I see this, as you'd expect given that only 1 of the disks is installed.
Code:
root@Proxmox:~# zpool status -v
  pool: VM_Storage
 state: DEGRADED
status: One or more devices could not be used because the label is missing or
        invalid.  Sufficient replicas exist for the pool to continue
        functioning in a degraded state.
action: Replace the device using 'zpool replace'.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-4J
  scan: resilvered 1.30M in 00:00:00 with 0 errors on Tue Dec 16 12:56:08 2025
config:

        NAME                        STATE     READ WRITE CKSUM
        VM_Storage                  DEGRADED     0     0     0
          mirror-0                  DEGRADED     0     0     0
            scsi-35002538a48b2ad10  ONLINE       0     0     0
            15463254570385615369    UNAVAIL      0     0     0  was /dev/disk/by-id/scsi-35002538a48b2ad30-part1

errors: No known data errors

  pool: rpool
 state: ONLINE
  scan: resilvered 54.2M in 00:00:00 with 0 errors on Tue Dec 16 13:06:27 2025
config:

        NAME                                                STATE     READ WRITE CKSUM
        rpool                                               ONLINE       0     0     0
          mirror-0                                          ONLINE       0     0     0
            ata-INTEL_SSDSCKKB480G8_BTYH12120Q93480K-part3  ONLINE       0     0     0
            ata-INTEL_SSDSCKKB480G8_BTYH12120U9X480K-part3  ONLINE       0     0     0

errors: No known data errors
root@Proxmox:~#
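
To make outputs like this easier to compare, here is a rough sketch of a filter that keeps only the rows that are not ONLINE. The function name zpool_not_online is made up for illustration and is not part of ZFS; it only assumes the text layout shown in the output above.

```shell
#!/bin/sh
# Hypothetical helper (not part of ZFS): reads `zpool status` text on stdin
# and prints "pool device state" for every config row that is not ONLINE.
zpool_not_online() {
    awk '
        $1 == "pool:"   { pool = $2; in_config = 0; next }  # remember the current pool
        $1 == "NAME"    { in_config = 1; next }             # device table starts after this header
        $1 == "errors:" { in_config = 0; next }             # device table has ended
        in_config && NF >= 2 && $2 != "ONLINE" { print pool, $1, $2 }
    '
}
```

It would be used as `zpool status -v | zpool_not_online`. For the output above it should report VM_Storage, mirror-0 and the UNAVAIL disk, and nothing for rpool.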


I give it 5-10 mins to stabilise, then insert the 2nd disk and run it again.

Code:
root@Proxmox:~# zpool status -v
  pool: VM_Storage
 state: ONLINE
  scan: resilvered 832K in 00:00:00 with 0 errors on Tue Dec 16 13:28:05 2025
config:

        NAME                        STATE     READ WRITE CKSUM
        VM_Storage                  ONLINE       0     0     0
          mirror-0                  ONLINE       0     0     0
            scsi-35002538a48b2ad10  ONLINE       0     0     0
            scsi-35002538a48b2ad30  ONLINE       0     0     0

errors: No known data errors

  pool: rpool
 state: ONLINE
  scan: resilvered 54.2M in 00:00:00 with 0 errors on Tue Dec 16 13:06:27 2025
config:

        NAME                                                STATE     READ WRITE CKSUM
        rpool                                               ONLINE       0     0     0
          mirror-0                                          ONLINE       0     0     0
            ata-INTEL_SSDSCKKB480G8_BTYH12120Q93480K-part3  ONLINE       0     0     0
            ata-INTEL_SSDSCKKB480G8_BTYH12120U9X480K-part3  ONLINE       0     0     0

errors: No known data errors
root@Proxmox:~#


All looks great. But then I reboot and get "rpool has detected an unrecoverable IO failure". Oddly enough, the IO errors it complains about relate to rpool, not VM_Storage.

Once it has booted and the system is running you can see the error, and running zpool clear on rpool doesn't fix the issue.


Code:
  pool: rpool
 state: SUSPENDED
status: One or more devices are faulted in response to IO failures.
action: Make sure the affected devices are connected, then run 'zpool clear'.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-HC
  scan: resilvered 54.2M in 00:00:00 with 0 errors on Tue Dec 16 13:06:27 2025
config:

        NAME                                                STATE     READ WRITE CKSUM
        rpool                                               DEGRADED     0     0     0
          mirror-0                                          DEGRADED     3    16     0
            ata-INTEL_SSDSCKKB480G8_BTYH12120Q93480K-part3  REMOVED      0     0     0
            ata-INTEL_SSDSCKKB480G8_BTYH12120U9X480K-part3  ONLINE       3    16     0

errors: List of errors unavailable: pool I/O is currently suspended
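
For reference, the clear-then-recheck step can be sketched as below. The wrapper name clear_and_check and the ZPOOL variable are my own illustration (the variable only exists so the binary can be swapped out); the two underlying commands are the standard `zpool clear <pool>` and `zpool status -x <pool>`.

```shell
#!/bin/sh
# Hypothetical wrapper: clear the error state on a pool, then ask ZFS
# whether it still considers that pool unhealthy.
ZPOOL=${ZPOOL:-zpool}   # assumption: zpool is on PATH unless overridden

clear_and_check() {
    pool=$1
    "$ZPOOL" clear "$pool" || return 1   # reset error counters / resume I/O
    "$ZPOOL" status -x "$pool"           # -x reports only pools with problems
}
```

Called as `clear_and_check rpool`. In my case the pool stayed in the state shown above after the clear.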

Hopefully someone can help!

Thanks
 
Update: I moved the HBA to another slot and it's all working now. Odd. For the future, is there some reference to the card in the config that I need to know about?
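
As far as I can tell from the zpool status output, the pools refer to the disks by their /dev/disk/by-id names rather than by slot or controller. A small sketch for checking what those names currently resolve to is below; the function name list_disk_ids is hypothetical, and the directory argument is only there so it can be pointed somewhere else.

```shell
#!/bin/sh
# Hypothetical helper: print each symlink in a by-id style directory
# together with the device node it currently resolves to.
list_disk_ids() {
    dir=${1:-/dev/disk/by-id}
    for link in "$dir"/*; do
        [ -L "$link" ] || continue   # skip anything that is not a symlink
        printf '%s -> %s\n' "${link##*/}" "$(readlink -f "$link")"
    done
}
```

Running `list_disk_ids` before and after moving a card would show whether the same IDs still point at a device.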
 