Single drive Proxmox host failing with ZFS unrecoverable I/O error

crawforc3

Apr 6, 2023
I have a Proxmox host using a single SSD and a ZFS filesystem. The zpool was created when installing proxmox (v7) and the pool name is `rpool`. Yesterday, the Proxmox host failed and I’m not sure how to troubleshoot it.

On boot, there was an error saying something about `encountered an unrecoverable I/O error and rpool has been disabled` (Paraphrasing a bit, because I’m on my phone and the system didn't stop at this error).

I pulled the drive and made a raw copy on another system using `dd`. I was able to attach the drive externally via USB and import the pool, and I could also import the pool from the raw image without any issues. `zpool status` says the pool is online for both.

It’s been difficult to troubleshoot because I can’t get a prompt on the host machine. I’m a ZFS newb and not sure what to do next.

I have a spare SSD I can use if that helps.
 
Run a scrub on the pool to check data integrity, and look at the drive's SMART status to gauge its health. In general, though, I'd mirror the drive to a healthy one, either with an imaging tool or on a system where you can import the pool and have ZFS create a mirror of your single drive. Since it is an rpool, you will also need to recreate the partitions and bootloader on the new drive; see the pve-docs for that [0], and zpool attach [1] if you choose to use ZFS to mirror the drive (you can also find more information here on the forum).

[0] https://pve.proxmox.com/pve-docs/chapter-sysadmin.html#_zfs_administration
[1] https://openzfs.github.io/openzfs-docs/man/master/8/zpool-attach.8.html
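
A rough sketch of those steps, to be adapted rather than pasted. It assumes the pool is imported on a working system, that `/dev/sdX` (current SSD) and `/dev/sdY` (spare) are placeholder names, and that the disk has the default Proxmox installer layout with the ESP on partition 2 and ZFS on partition 3. The function names `check_pool` and `mirror_rpool` are made up for illustration; the `sgdisk` and `proxmox-boot-tool` calls follow the bootable-device procedure in the pve-docs [0].

```shell
# Hypothetical helpers; nothing runs until you call a function.

# Start a scrub and print the pool status, including files with errors.
# The scrub runs in the background, so re-run `zpool status` to follow it.
check_pool() {
    pool="${1:-rpool}"
    zpool scrub "$pool"
    zpool status -v "$pool"
}

# Turn a single-disk rpool into a mirror.
# WARNING: overwrites the partition table on the spare disk.
mirror_rpool() {
    [ "$#" -ge 3 ] || { echo "usage: mirror_rpool OLD_DISK NEW_DISK OLD_VDEV [SEP]" >&2; return 1; }
    old_disk="$1"   # disk currently holding rpool, e.g. /dev/sdX
    new_disk="$2"   # spare disk, e.g. /dev/sdY
    old_vdev="$3"   # device name exactly as `zpool status rpool` prints it
    sep="${4:-}"    # partition separator: "" for /dev/sdY, "-part" for by-id paths

    sgdisk "$old_disk" -R "$new_disk" &&                     # copy the partition table
    sgdisk -G "$new_disk" &&                                 # randomize GUIDs on the copy
    zpool attach rpool "$old_vdev" "${new_disk}${sep}3" &&   # assumes ZFS is partition 3
    proxmox-boot-tool format "${new_disk}${sep}2" &&         # assumes the ESP is partition 2
    proxmox-boot-tool init "${new_disk}${sep}2"
}
```

For example, `check_pool rpool` first, then `mirror_rpool /dev/sdX /dev/sdY sdX3` once the scrub results look sane. After the attach, `zpool status rpool` shows the resilver progress; leave the old drive connected until it finishes.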
 
Please check the system logs for any event that might point to the disk being defective. Did you run a SMART check (for example with `smartctl`)?
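
A minimal sketch of such a check, assuming smartmontools is installed on whichever system the SSD is attached to and that `/dev/sdX` is a placeholder for it. The helper name `check_disk` is made up; the second argument is passed to `smartctl -d`, and a USB enclosure often needs `sat` there instead of the default `auto`.

```shell
# Hypothetical helper; nothing runs until you call it.
check_disk() {
    disk="$1"             # suspect drive, e.g. /dev/sdX
    dtype="${2:-auto}"    # smartctl device type; try "sat" behind a USB bridge

    smartctl -d "$dtype" -H "$disk"    # overall health self-assessment
    smartctl -d "$dtype" -a "$disk"    # full attributes and device error log

    # Kernel messages of priority "err" or worse from the current boot,
    # which is where I/O errors for an attached disk usually show up.
    journalctl -k -p err --no-pager
}
```

Call it as `check_disk /dev/sdX` or `check_disk /dev/sdX sat`. Reallocated or pending sector counts and logged device errors in the `smartctl -a` output are the usual things to look for.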

Also note that for Btrfs and ZFS it is better practice (faster and safer) to use their respective send and receive commands when making a copy of a disk.
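
For ZFS, that could look roughly like the sketch below. It assumes a destination pool already exists on the healthy disk; the pool name `newpool`, the snapshot name `migrate`, and the helper name `replicate_pool` are made up for illustration. Note that `zfs receive -F` overwrites whatever is in the destination, and that a pool copied this way still needs the partitions and bootloader set up before it can boot.

```shell
# Hypothetical helper; nothing runs until you call it.
# Replicates every dataset in SRC to DST via one recursive snapshot.
replicate_pool() {
    src="$1"               # source pool, e.g. rpool
    dst="$2"               # existing destination pool, e.g. newpool
    snap="${3:-migrate}"   # snapshot name to create on the source

    # -r snapshots all descendant datasets at the same moment.
    zfs snapshot -r "${src}@${snap}" &&
    # -R sends the whole dataset tree with its properties;
    # -F lets the receive roll back/overwrite DST, -u skips mounting.
    zfs send -R "${src}@${snap}" | zfs receive -F -u "$dst"
}
```

Usage would be `replicate_pool rpool newpool`. Unlike `dd`, this only transfers allocated blocks and verifies checksums as it reads, so unreadable data surfaces as an error instead of being copied silently.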

While there are benefits to using a single ZFS disk, such as being able to detect corrupted data, it is generally recommended to use a mirror (RAID1) to truly protect your data.
 