[feature suggestion] Degraded ZFS pools should be reflected in node state

Keeper of the Keys

Active Member
Jul 7, 2021
Recently one of my PVE machines had a degraded RAIDZ2 pool: two disks had simply disappeared. Because RAIDZ2 tolerates the loss of two disks, I did not notice anything in day-to-day work.
The disks were detected again after a reboot and resilvered in a few hours. I am mildly curious whether there was a bug in one of the recent kernel builds or whether the hardware didn't reinitialize properly during boot, but the real question I was left with is:

Why wasn't this very dangerous state my machine was in reported in a prominent way?
The PVE and PDM summary pages all showed nice green checkmarks. I don't recall why I even ended up looking at the disks; I think I was playing around with something completely unrelated and realized that disks were missing.

Obviously I should spend some time setting up mail alerts too. However, it seems to me that a system with a degraded pool should show warning signs on the PVE summary page and in PDM as well, not only on the ZFS settings page or in the output of the zpool status command.
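
Until something like this exists in the GUI, a small external check can fill the gap. The following is only a minimal sketch, assuming the `zpool` binary is on the PATH; it reads `zpool list -H -o name,health` and reports every pool whose health is not ONLINE. The function names are made up for illustration and are not part of PVE or PDM.

```python
import subprocess


def parse_pool_health(listing: str) -> dict[str, str]:
    """Parse tab-separated `zpool list -H -o name,health` output into {pool: health}."""
    pools = {}
    for line in listing.splitlines():
        if not line.strip():
            continue  # skip blank lines
        name, health = line.split("\t")
        pools[name] = health.strip()
    return pools


def unhealthy_pools(pools: dict[str, str]) -> dict[str, str]:
    """Return only the pools whose health is anything other than ONLINE."""
    return {name: health for name, health in pools.items() if health != "ONLINE"}


def read_pool_listing() -> str:
    """Run zpool in scripted mode (-H: no header, tab-separated fields)."""
    result = subprocess.run(
        ["zpool", "list", "-H", "-o", "name,health"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout


def main() -> int:
    """Print a warning per unhealthy pool; return 1 if any pool is not ONLINE."""
    bad = unhealthy_pools(parse_pool_health(read_pool_listing()))
    for name, health in bad.items():
        print(f"WARNING: pool {name} is {health}")
    return 1 if bad else 0
```

Calling `main()` from a cron job or systemd timer and using its return value as the exit code would let any existing monitoring pick up a degraded pool.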
 
I don't think your feature request is likely to be implemented just because you post it on the forum.

Please check the URL below to see if your request already exists. If it does, please add your comment to the existing thread; if not, please submit a new inquiry.

https://bugzilla.proxmox.com/
 
ZED (the ZFS Event Daemon) sends a notification mail when a pool changes to the DEGRADED state, provided it is properly configured. See:
https://pve.proxmox.com/wiki/ZFS_on_Linux#_configure_e_mail_notification
/etc/zfs/zed.d/statechange-notify.sh
/etc/zfs/zed.d/zed.rc

This is outside of Proxmox though and relies on working system-wide email functionality.
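
To check quickly whether ZED has a mail recipient configured at all, something along these lines might help. It is a rough sketch, assuming `zed.rc` uses plain shell-style `NAME=value` assignments and that `ZED_EMAIL_ADDR` is the relevant variable; the helper names are hypothetical, and it does not test whether mail delivery itself works.

```python
import re
from pathlib import Path

# Matches active (uncommented) shell-style assignments such as ZED_EMAIL_ADDR="root".
# Lines starting with '#' never match, so commented-out settings are ignored.
ASSIGNMENT = re.compile(r"^\s*(?:export\s+)?([A-Za-z_][A-Za-z0-9_]*)=(.*)$")


def active_settings(text: str) -> dict[str, str]:
    """Collect uncommented NAME=value assignments, with surrounding quotes removed."""
    settings = {}
    for line in text.splitlines():
        match = ASSIGNMENT.match(line)
        if match:
            settings[match.group(1)] = match.group(2).strip().strip("\"'")
    return settings


def mail_configured(text: str) -> bool:
    """True if ZED_EMAIL_ADDR is set to a non-empty value."""
    return bool(active_settings(text).get("ZED_EMAIL_ADDR"))


def check_zed_rc(path: str = "/etc/zfs/zed.d/zed.rc") -> bool:
    """Read zed.rc from disk and report whether a mail recipient is configured."""
    return mail_configured(Path(path).read_text())
```

If `check_zed_rc()` returns `False`, the recipient line is probably still commented out or empty.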

Some related older threads:
https://forum.proxmox.com/threads/no-zfs-degraded-notifications.58783/
https://forum.proxmox.com/threads/no-email-notification-for-zfs-status-degraded.87629/
https://forum.proxmox.com/threads/zfs-notifications.181029/
 
@daanw I know I can have mail alerts. My point is that a PVE/PBS node with a degraded pool, or SMART errors for that matter, should not show "all green/OK" in the summary or on the PDM dashboards. The system is NOT OK and may be teetering on the edge of a dangerous cliff. This should be shown to the user front and center, not buried in submenus or left for the user to monitor through unrelated methods.
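
To make the idea concrete, here is a purely illustrative sketch of how pool health and SMART results could be folded into a single node state. This is not how PVE or PDM compute their status today; the `Severity` levels, the function names, and the mapping (DEGRADED as a warning, any other non-ONLINE state as critical, a failed SMART check as a warning) are all my own assumptions.

```python
from enum import IntEnum


class Severity(IntEnum):
    """Hypothetical node states, ordered so that max() picks the worst one."""
    OK = 0
    WARNING = 1
    CRITICAL = 2


def pool_severity(health: str) -> Severity:
    """Map a ZFS pool health string to a severity (assumed mapping)."""
    if health == "ONLINE":
        return Severity.OK
    if health == "DEGRADED":
        return Severity.WARNING
    return Severity.CRITICAL  # e.g. FAULTED or UNAVAIL


def node_severity(pool_healths: dict[str, str], smart_passed: dict[str, bool]) -> Severity:
    """Return the worst severity across all pools and all SMART results."""
    levels = [pool_severity(health) for health in pool_healths.values()]
    levels += [Severity.OK if ok else Severity.WARNING for ok in smart_passed.values()]
    return max(levels, default=Severity.OK)
```

With a mapping like this, a single degraded pool would be enough to turn the node's summary from green to a warning.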