Mirrored zpool: Proxmox stops when one disk fails

Hello, I have a 2-disk mirror ZFS pool for Proxmox (as the root partition). When one of these two disks failed, my system shut down all VMs and stayed in some kind of emergency mode. Is it possible to set it up so that the system keeps running when a pool disk fails? Thank you.

Code:
impact: Fault tolerance of the pool may be compromised.
    eid: 3460445
  class: statechange
  state: FAULTED
   host: my.host.com
   time: 2024-09-..
  vpath: /dev/disk/by-id/ata-...
  vguid: 0x6...
   pool: rpool (0x1C...)

Code:
        NAME                                                      STATE     READ WRITE CKSUM
        rpool                                                     ONLINE       0     0     0
          mirror-0                                                ONLINE       0     0     0
            ata-Samsung...-partX                                  ONLINE       0     0     0
            ata-Samsung...-partX                                  ONLINE       0     0     0
 
On a fresh installation, Proxmox does not shut down VMs and stop running when one side of a ZFS mirror fails.
What caused the (temporary?) drive failure? Maybe it also caused a failure of the other drive (which could not be written to the system log because it was the last drive)? Maybe it was the SATA controller to which both drives are connected? Are you using QLC drives that can cause a ZFS failure because they write too slowly (which could easily happen on both drives at the same time because of the mirror)?
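
If it helps narrow it down, this is roughly how I would check the drive and the link after such an event (a sketch only; the device path is a placeholder for your disk):

Code:
# SMART health, attributes and error log of the suspect drive (placeholder path)
smartctl -a /dev/disk/by-id/ata-...

# kernel messages about the SATA link (look for "link down" or reset loops)
journalctl -k | grep -iE 'ata[0-9]+'

# current view of the pool
zpool status -v rpool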
 
There was a failure. I replaced the drive with another one (a different model) and it failed after one month, in the same drive bay, so I think it's not the drive but the controller/SATA cable. It's a Supermicro case and the indicator on the drive caddy was steady green. Now I've moved it to another slot/controller and will see; the drive is working fine so far.

But the server really did stop all VMs. It sent me an e-mail, waited a few minutes, and after the disk timeout it started shutting down services.

Why is the vdev FAULTED? Shouldn't it be DEGRADED, with only the drive FAULTED?

Code:
kernel: [3113404.038030] ata1: SATA link down (SStatus 0 SControl 3F0)
kernel: [3113404.038039] ata1.00: disable device
kernel: [3113404.038058] ata1.00: detaching (SCSI 0:0:0:0)
kernel: [3113404.038084] device offline error, dev sda, sector 0 op 0x1:(WRITE) flags 0x800 phys_seg 0 prio class 0

2024-09-05 19:27:50.473271 kernel: [3113404.038678] zio pool=rpool vdev=/dev/disk/by-id/ata-...-part3 error=5 type=5 offset=0 size=0 flags=1049728
2024-09-05 19:27:50.473278 kernel: [3113404.038708] device offline error, dev sda, sector 162259880 op 0x1:(WRITE) flags 0x0 phys_seg 3 prio class 0
2024-09-05 19:27:50.474398 kernel: [3113404.038875] device offline error, dev sda, sector 1051040 op 0x1:(WRITE) flags 0x0 phys_seg 1 prio class 0
2024-09-05 19:27:50.474403 kernel: [3113404.039276] zio pool=rpool vdev=/dev/disk/by-id/ata-...-part3 error=5 type=2 offset=82539139072 size=4096 flags=1572992
2024-09-05 19:27:50.474407 kernel: [3113404.039863] zio pool=rpool vdev=/dev/disk/by-id/ata-...-part3 error=5 type=2 offset=82539143168 size=4096 flags=1572992
2024-09-05 19:27:50.474411 kernel: [3113404.039868] zio pool=rpool vdev=/dev/disk/by-id/ata-...-part3 error=5 type=2 offset=82539147264 size=4096 flags=1572992
2024-09-05 19:27:50.475158 kernel: [3113404.039871] zio pool=rpool vdev=/dev/disk/by-id/ata-...-part3 error=5 type=2 offset=212992 size=4096 flags=1573568
2024-09-05 19:27:50.475164 kernel: [3113404.039893] device offline error, dev sda, sector 1051152 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
2024-09-05 19:27:50.475164 kernel: [3113404.039901] device offline error, dev sda, sector 1051552 op 0x1:(WRITE) flags 0x0 phys_seg 1 prio class 0

zed: eid=3460967 class=io pool='rpool' vdev=ata-...-part3 size=8192 offset=270336 priority=0 err=5 flags=0xb00c1
2024-09-05 19:27:50.483019 zed: eid=3460968 class=io pool='rpool' vdev=ata-...-part3 size=8192 offset=249520726016 priority=0 err=5 flags=0xb00c1
2024-09-05 19:27:50.483308 zed: eid=3460969 class=io pool='rpool' vdev=ata-...-part3 size=8192 offset=249520988160 priority=0 err=5 flags=0xb00c1
2024-09-05 19:27:50.483452 zed: eid=3460970 class=probe_failure pool='rpool' vdev=ata-...-part3
2024-09-05 19:27:50.500398 zed: eid=3460971 class=statechange pool='rpool' vdev=ata-...-part3 vdev_state=FAULTED
2024-09-05 19:27:50.508186 kernel: [3113404.072648] sd 0:0:0:0: [sda] Synchronizing SCSI cache
2024-09-05 19:27:50.508200 kernel: [3113404.072707] sd 0:0:0:0: [sda] Synchronize Cache(10) failed: Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK

kernel: [3113404.127823] FAT-fs (sda2): unable to read boot sector to mark fs as dirty

2024-09-05 19:31:31.837669 systemd[1]: dev-disk-by\x2duuid-FEEC\x2d7C53.device: Job dev-disk-by\x2duuid-FEEC\x2d7C53.device/start timed out.
2024-09-05 19:31:31.837834 systemd[1]: Timed out waiting for device dev-disk-by\x2duuid-FEEC\x2d7C53.device - Samsung_SSD_.
2024-09-05 19:31:31.837926 systemd[1]: Dependency failed for systemd-fsck@dev-disk-by\x2duuid-FEEC\x2d7C53.service - File System Check on /dev/disk/by-uuid/FEEC-7C53.
2024-09-05 19:31:31.838014 systemd[1]: Dependency failed for boot-efi.mount - /boot/efi.
2024-09-05 19:31:31.838116 systemd[1]: Dependency failed for local-fs.target - Local File Systems.
2024-09-05 19:31:31.838241 systemd[1]: Dependency failed for sanoid.service - Snapshot ZFS filesystems.

2024-09-05 19:31:31.838321 systemd[1]: sanoid.service: Job sanoid.service/start failed with result 'dependency'.
2024-09-05 19:31:31.838394 systemd[1]: Dependency failed for sanoid-prune.service - Prune ZFS snapshots.
2024-09-05 19:31:31.838466 systemd[1]: sanoid-prune.service: Job sanoid-prune.service/start failed with result 'dependency'.
2024-09-05 19:31:31.838538 systemd[1]: local-fs.target: Job local-fs.target/start failed with result 'dependency'.
2024-09-05 19:31:31.838599 systemd[1]: local-fs.target: Triggering OnFailure= dependencies.
2024-09-05 19:31:31.839184 systemd[1]: boot-efi.mount: Job boot-efi.mount/start failed with result 'dependency'.
2024-09-05 19:31:31.839268 systemd[1]: systemd-fsck@dev-disk-by\x2duuid-FEEC\x2d7C53.service: Job systemd-fsck@dev-disk-by\x2duuid-FEEC\x2d7C53.service/start failed with result 'dependency'.
2024-09-05 19:31:31.839341 systemd[1]: dev-disk-by\x2duuid-FEEC\x2d7C53.device: Job dev-disk-by\x2duuid-FEEC\x2d7C53.device/start failed with result 'timeout'.
2024-09-05 19:31:31.848887 systemd[1]: systemd-ask-password-console.path: Deactivated successfully.
2024-09-05 19:31:31.850662 systemd[1]: Stopped systemd-ask-password-console.path - Dispatch Password Requests to Console Directory Watch.
2024-09-05 19:31:31.850871 systemd[1]: systemd-ask-password-wall.path: Deactivated successfully.
2024-09-05 19:31:31.851060 systemd[1]: Stopped systemd-ask-password-wall.path - Forward Password Requests to Wall Directory Watch.
2024-09-05 19:31:31.851204 systemd[1]: apt-daily-upgrade.timer: Deactivated successfully.
2024-09-05 19:31:31.851357 systemd[1]: Stopped apt-daily-upgrade.timer - Daily apt upgrade and clean activities.
2024-09-05 19:31:31.851484 systemd[1]: apt-daily.timer: Deactivated successfully.
2024-09-05 19:31:31.851598 systemd[1]: Stopped apt-daily.timer - Daily apt download activities.
2024-09-05 19:31:31.851730 systemd[1]: dpkg-db-backup.timer: Deactivated successfully.
2024-09-05 19:31:31.851834 systemd[1]: Stopped dpkg-db-backup.timer - Daily dpkg database backup timer.
2024-09-05 19:31:31.851965 systemd[1]: e2scrub_all.timer: Deactivated successfully.
2024-09-05 19:31:31.852074 systemd[1]: Stopped e2scrub_all.timer - Periodic ext4 Online Metadata Check for All Filesystems.
2024-09-05 19:31:31.852241 systemd[1]: logrotate.timer: Deactivated successfully.
2024-09-05 19:31:31.852356 systemd[1]: Stopped logrotate.timer - Daily rotation of log files.
2024-09-05 19:31:31.852502 systemd[1]: man-db.timer: Deactivated successfully.
2024-09-05 19:31:31.852615 systemd[1]: Stopped man-db.timer - Daily man-db regeneration.
2024-09-05 19:31:31.852763 systemd[1]: pve-daily-update.timer: Deactivated successfully.
2024-09-05 19:31:31.852887 systemd[1]: Stopped pve-daily-update.timer - Daily PVE download activities.
2024-09-05 19:31:31.853045 systemd[1]: sanoid.timer: Deactivated successfully.
2024-09-05 19:31:31.853225 systemd[1]: Stopped sanoid.timer - Run Sanoid Every 15 Minutes.
2024-09-05 19:31:31.853356 systemd[1]: sysstat-collect.timer: Deactivated successfully.
2024-09-05 19:31:31.853496 systemd[1]: Stopped sysstat-collect.timer - Run system activity accounting tool every 10 minutes.
2024-09-05 19:31:31.853633 systemd[1]: sysstat-summary.timer: Deactivated successfully.
2024-09-05 19:31:31.853741 systemd[1]: Stopped sysstat-summary.timer - Generate summary of yesterday's process accounting.
2024-09-05 19:31:31.853903 systemd[1]: systemd-tmpfiles-clean.timer: Deactivated successfully.
2024-09-05 19:31:31.854024 systemd[1]: Stopped systemd-tmpfiles-clean.timer - Daily Cleanup of Temporary Directories.
 
Code:
kernel: [3113404.038030] ata1: SATA link down (SStatus 0 SControl 3F0)
kernel: [3113404.038039] ata1.00: disable device
kernel: [3113404.038058] ata1.00: detaching (SCSI 0:0:0:0)
kernel: [3113404.038084] device offline error, dev sda, sector 0 op 0x1:(WRITE) flags 0x800 phys_seg 0 prio class 0

2024-09-05 19:27:50.473271 kernel: [3113404.038678] zio pool=rpool vdev=/dev/disk/by-id/ata-...-part3 error=5 type=5 offset=0 size=0 flags=1049728
2024-09-05 19:27:50.473278 kernel: [3113404.038708] device offline error, dev sda, sector 162259880 op 0x1:(WRITE) flags 0x0 phys_seg 3 prio class 0
According to the Linux kernel, the connection to the drive was lost, and the ZFS problem with one of the drives is a consequence of that. This indicates a cable problem, SATA controller problem, power problem, motherboard problem, memory problem or CPU problem (in decreasing order of likelihood).
Why is the vdev FAULTED? Shouldn't it be DEGRADED, with only the drive FAULTED?
The pool is DEGRADED (but working) when one or more vdevs (drives) are FAULTED.
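
For illustration, a mirror with one failed member should look roughly like this in zpool status (a mock-up with placeholder names and numbers, not output from your system):

Code:
  pool: rpool
 state: DEGRADED
status: One or more devices are faulted in response to persistent errors.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
config:

        NAME                        STATE     READ WRITE CKSUM
        rpool                       DEGRADED     0     0     0
          mirror-0                  DEGRADED     0     0     0
            ata-Samsung...-partX    FAULTED      0    12     0  too many errors
            ata-Samsung...-partX    ONLINE       0     0     0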

But the server really did stop all VMs. It sent me an e-mail, waited a few minutes, and after the disk timeout it started shutting down services.
Code:
2024-09-05 19:31:31.837834 systemd[1]: Timed out waiting for device dev-disk-by\x2duuid-FEEC\x2d7C53.device - Samsung_SSD_.
2024-09-05 19:31:31.837926 systemd[1]: Dependency failed for systemd-fsck@dev-disk-by\x2duuid-FEEC\x2d7C53.service - File System Check on /dev/disk/by-uuid/FEEC-7C53.
2024-09-05 19:31:31.838014 systemd[1]: Dependency failed for boot-efi.mount - /boot/efi.
Looks like systemd decided that the missing ESP partition (which was also on that drive) is enough reason to shut down Proxmox gracefully.

This is really surprising and not at all what I would want from Proxmox. I don't know whether your /etc/fstab contains something special or whether systemd detects and decides this on its own. Maybe you can report this as a bug at https://bugzilla.proxmox.com/ ?

EDIT: What is the output of systemctl list-units -t mount --all ? My Proxmox does not have boot-efi in that list. What method did you use to install Proxmox?
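
For context, a sketch of how to see why a failed fstab mount can escalate like this, assuming a stock systemd setup (fstab lines become .mount units via systemd-fstab-generator; without nofail they are required by local-fs.target, and a failure of local-fs.target triggers emergency mode, which matches the "Triggering OnFailure= dependencies" line in your log):

Code:
# show the mount unit that systemd generated from the fstab line
systemctl cat boot-efi.mount

# local-fs.target escalates when a required mount fails
systemctl show -p OnFailure local-fs.target    # typically: OnFailure=emergency.target

# mounts that local-fs.target hard-requires (nofail mounts are only "wanted")
systemctl show -p Requires local-fs.target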
 
Looks like systemd decided that the missing ESP partition (which was also on that drive) is enough reason to shut down Proxmox gracefully.

This is really surprising and not at all what I would want from Proxmox.
Yeah, I agree; I want the system to keep running, not to shut down all VMs and wait for a boot repair.

That's sad, because I've added a third drive to the mirror, but according to this, if the boot partition disappears again, it will fail again :( Do you have any idea how I can disable this check? Or how to have multiple boot partitions? Maybe I can clone them and use proxmox-boot-something?

The pool is DEGRADED (but working) when one or more vdevs (drives) are FAULTED.
Oh, yes. Right. I was thinking that the vdev is mirror-0 in my situation, but it says vdev=ata-... .

What is the output
I'm not sure which method I used, because it's an old system. At that time, Proxmox couldn't be installed on a ZFS root, so I had to boot into some live system, compile & modprobe the ZFS modules, debootstrap Debian...

Code:
# systemctl list-units -t mount --all
  UNIT                                                              LOAD      ACTIVE   SUB     DESCRIPTION
  -.mount                                                           loaded    active   mounted Root Mount
  boot-efi.mount                                                    loaded    active   mounted /boot/efi

Thank you.

fstab:
Code:
# cat /etc/fstab
# /etc/fstab: static file system information.
#
# Use 'blkid' to print the universally unique identifier for a
# device; this may be used with UUID= as a more robust way to name devices
# that works even if disks are added and removed. See fstab(5).
#
# <file system>         <mount point>   <type>  <options>       <dump>  <pass>
/dev/zvol/rpool/swap     none            swap    defaults        0       0
rpool/var                /var            zfs     defaults        0       0
rpool/var/tmp            /var/tmp        zfs     defaults        0       0
/dev/disk/by-uuid/FEEC-7C53 /boot/efi vfat defaults 0 1
 
Code:
# cat /etc/fstab
# /etc/fstab: static file system information.
#
# Use 'blkid' to print the universally unique identifier for a
# device; this may be used with UUID= as a more robust way to name devices
# that works even if disks are added and removed. See fstab(5).
#
# <file system>         <mount point>   <type>  <options>       <dump>  <pass>
/dev/zvol/rpool/swap     none            swap    defaults        0       0
rpool/var                /var            zfs     defaults        0       0
rpool/var/tmp            /var/tmp        zfs     defaults        0       0
/dev/disk/by-uuid/FEEC-7C53 /boot/efi vfat defaults 0 1
There it is! I guess this is not a Proxmox bug as the installer does not add that line. The /var and /var/tmp lines also seem superfluous to me (since rpool is mounted at /).
You can probably fix the shutdown issue by removing /dev/disk/by-uuid/FEEC-7C53 /boot/efi vfat defaults 0 1 from /etc/fstab and running systemctl daemon-reload to let systemd know.
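
A minimal sketch of those two steps (comment the line out rather than deleting it, in case you want it back):

Code:
# comment out the /boot/efi line in /etc/fstab
sed -i 's|^/dev/disk/by-uuid/FEEC-7C53|#&|' /etc/fstab

# let systemd regenerate its mount units from the changed fstab
systemctl daemon-reload

# verify that boot-efi.mount no longer shows up as a required mount
systemctl list-units -t mount --all

An alternative I have not tested would be to keep the line but add the nofail mount option, which makes the mount "wanted" instead of "required", so a missing ESP should no longer take local-fs.target down with it.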
 
You can probably fix the shutdown issue by removing /dev/disk/by-uuid/FEEC-7C53 /boot/efi vfat defaults 0 1 from /etc/fstab and running systemctl daemon-reload to let systemd know.
Are you sure that I shouldn't have /boot/efi in fstab? Will it still boot after I remove it? I don't know in detail how the GRUB/EFI boot sequence really works, what the steps are, or what loads after what. Can I clone the EFI partition to another disk, and will it boot when a disk fails? Probably proxmox-boot-tool can help there: https://pve.proxmox.com/wiki/Host_Bootloader
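
From that wiki page, the workflow for adding a redundant ESP on the new drive seems to be roughly this (the partition path is a placeholder for the EFI partition on the third disk; I haven't run it yet):

Code:
# format the EFI system partition on the new drive
proxmox-boot-tool format /dev/sdX2

# register it: installs the bootloader and adds the ESP to the list kept in sync
proxmox-boot-tool init /dev/sdX2

# list all ESPs that proxmox-boot-tool keeps in sync
proxmox-boot-tool status

# copy the current kernels and bootloader config to all registered ESPs
proxmox-boot-tool refresh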

The /var and /var/tmp lines also seem superfluous to me (since rpool is mounted at /).
This is fine; I have them separated on purpose. I'm not sure why the mountpoint is legacy and not a native ZFS mountpoint (a rough comparison is sketched after the listing below).

Code:
rpool                                                  33.6G   191G    96K  none
rpool/root                                             10.6G   191G    96K  none
rpool/root/proxmox                                     10.6G   191G  10.5G  /
rpool/swap                                             3.41G   193G  1.28G  -
rpool/tmp                                              47.6M  2.95G   284K  /tmp
rpool/var                                              19.4G   191G  16.7G  legacy
rpool/var/tmp                                          1.78M  3.00G   152K  legacy
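
As far as I understand it, legacy just means that ZFS leaves the mounting to fstab/mount instead of doing it itself. A rough sketch of the difference (not something I've changed on this box, and switching a live /var would need care):

Code:
# legacy: mounting is done by fstab / mount(8)
zfs get mountpoint rpool/var          # -> legacy
mount -t zfs rpool/var /var           # what the fstab line effectively does

# native: ZFS mounts the dataset itself at pool import
zfs set mountpoint=/var rpool/var     # would replace the fstab line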

This is what I expect when one disk fails: a DEGRADED pool, as in https://forum.proxmox.com/threads/how-to-mount-a-zfs-drive-from-promox.37104/#post-182145
 
What status was the pool in when the disk failed? (zpool status -v)
I didn't check that, so I don't know.

Okay, I will remove /boot/efi from fstab. I think it makes sense not to have it there while multiple EFI partitions exist (one on each drive).
 