No email notification for zfs status degraded

I configured Postfix to send emails.
https://www.alldiscoveries.com/conf...th-gmail-or-another-email-server-via-postfix/
In my case, the problem was a missing /etc/aliases.db.
Check the Postfix logs with the command:
journalctl -u 'postfix*'
"Feb 13 10:51:18 pve-4 postfix/local[2606]: error: open database /etc/aliases.db: No such file or directory
Feb 13 10:51:18 pve-4 postfix/local[2606]: warning: hash:/etc/aliases is unavailable. open database /etc/aliases.db: No such file or directory
Feb 13 10:51:18 pve-4 postfix/local[2606]: warning: hash:/etc/aliases: lookup of 'error' failed".
Or simply check whether the file "/etc/aliases.db" exists.
I don't know why, but on standalone installations of the latest Proxmox VE 8.1-2 that have not been joined to a cluster (I haven't tried that yet), the file is not generated automatically.
Run the command manually:
newaliases
and restart Postfix:
systemctl restart postfix
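
The check and fix above could be wrapped in a small helper, roughly like this. It is only a sketch: the function name aliases_db_missing is made up for illustration, and it assumes the default Postfix setup where the alias database sits next to the aliases file with a ".db" suffix.

```shell
#!/bin/sh
# Sketch only: aliases_db_missing is a hypothetical helper name.
# Returns success (0) when the compiled alias database for the given
# aliases file does not exist, which is the situation in the log above.
aliases_db_missing() {
    src="${1:-/etc/aliases}"   # aliases source file, default Postfix location
    db="${src}.db"             # assumed database name: source path + ".db"
    [ ! -f "$db" ]
}

# Usage (as root on the affected node):
#   if aliases_db_missing /etc/aliases; then
#       newaliases && systemctl restart postfix
#   fi
```

Running newaliases when the database already exists is harmless, so the check mainly serves to confirm that this was the cause.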
 
I have "mixed results" ... some ZFS notifications work, some not. Here's what I tested:

Preparation
  • Using Proxmox VE 7.3-3
  • Ran "apt install mailutils" (as per the above suggestion)
  • Created a ZFS pool "local-zfs" with 3 disks using the PVE GUI
  • Migrated a VM disk to the pool (just to have some data there)
  • Tested the below 3 scenarios, all of which end in a degraded pool
Scenario 1 (working)
  • Command "zpool offline -f local-zfs ata-QEMU_HARDDISK_QM00005"
    --> Email with subject "ZFS device fault for pool 0xBDE81065A6D18BCB on pve-vm1"
  • Command "zpool clear local-zfs"
    --> Email with subject "ZFS resilver_finish event for local-zfs on pve-vm1"
Scenario 2 (not working)
  • Command "zpool offline local-zfs ata-QEMU_HARDDISK_QM00005"
    --> No email (even though pool shows as "degraded")
  • Command "zpool online local-zfs ata-QEMU_HARDDISK_QM00005"
    --> Email with subject "ZFS resilver_finish event for local-zfs on pve-vm1"
Scenario 3 (not working)
  • Shut down PVE node
  • Unplug one of 3 hard disks of the pool
  • Start the PVE node and modify some data on the degraded pool (to force resilvering)
    --> No email (even though pool shows degraded)
  • Shut down PVE node
  • Replug the unplugged hard disk
  • Start the PVE node
  • Command "zpool status" shows "scan: resilvered 464K in 00:00:00 with 0 errors on <timestamp of just now>"
    --> No email (even though resilvering finished for the pool as in scenario 1 and 2)
Conclusions
  • Device failures (done with zpool offline -f) do trigger a mail alert --> Good!
  • The fact that a pool is degraded does not trigger an alert
  • Missing member disks of a pool (e.g. after a reboot) do not trigger an alert
  • Resilvering that completes in a very short time right after a reboot does not trigger an alert
Question: Any ideas on how to solve that inconsistent behaviour?

Two years later...

Did you ever get this to work?

Does anyone have a working ZED setup that sends a notification when a disk is UNAVAIL?
 
It took me a couple of days to figure this one out.

Notifications are triggered by a state change of a disk vdev, and a state change can't be detected while the server is powered down because ZED is not running :)

Someone here explained that offlining a disk is an intentional action by an admin, so it's not relevant for notifications.

But yeah, it works as intended.
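
Since ZED only reacts to state changes it sees while it is running, one possible workaround for the reboot cases in scenario 3 is a periodic check of the pool health itself. The following is a rough sketch, not something from ZED or Proxmox: the function names unhealthy_pools and check_pools are made up, and it assumes the "mail" command from mailutils and working local delivery to root.

```shell
#!/bin/sh
# Sketch only: both function names are hypothetical.

# Reads "name<TAB>health" lines, which is the format printed by
# `zpool list -H -o name,health`, and prints every pool whose health
# is anything other than ONLINE (DEGRADED, FAULTED, UNAVAIL, ...).
unhealthy_pools() {
    awk -F '\t' '$2 != "ONLINE" { print $1 " is " $2 }'
}

# Mails a short report if at least one pool is not ONLINE.
# The recipient defaults to root (an assumption), so it depends on the
# Postfix and aliases setup described earlier in this thread.
check_pools() {
    report=$(zpool list -H -o name,health | unhealthy_pools)
    if [ -n "$report" ]; then
        printf '%s\n' "$report" |
            mail -s "ZFS pool not healthy on $(hostname)" "${1:-root}"
    fi
}

# Usage: save the functions in a script, call check_pools at the end,
# and run it from cron (for example hourly, plus an @reboot entry).
```

Note that this sends a mail on every run for as long as the pool stays degraded, which may or may not be what you want. It also reports a disk that was deliberately taken offline, unlike ZED.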