No email notification for zfs status degraded

rafsaf · Dec 4, 2023

https://pve.proxmox.com/wiki/Roadmap#Improved_management_for_Proxmox_VE_clusters

Proxmox VE 8.1 added support for notifications via SMTP and Gotify, in addition to Postfix as before. If anybody is still around, has free time, and is in a position to test the scenarios above with the new options, please feel free to share the results.

I can test it myself closer to the end of the month.

Pietro395 · Dec 31, 2023

Are ZFS notifications enabled by default on Proxmox 8.1?
Or do I need to edit the zed config file with my email?
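
For reference, the zed config file in question is normally /etc/zfs/zed.d/zed.rc, which ZED sources as shell. The excerpt below is only a sketch of the mail-related variables as they appear in stock OpenZFS; the values are illustrative assumptions, and whether they are already set on a given Proxmox install should be checked in the file itself.

```shell
# Excerpt of /etc/zfs/zed.d/zed.rc (sourced as shell by ZED).
# Values below are illustrative assumptions, not confirmed Proxmox defaults.

# Recipient of ZED mails; "root" relies on root's mail being forwarded.
ZED_EMAIL_ADDR="root"

# Minimum number of seconds between notifications for a similar event.
ZED_NOTIFY_INTERVAL_SECS=3600

# 0 = suppress notifications while the pool is healthy,
# 1 = notify regardless of pool health.
ZED_NOTIFY_VERBOSE=1
```

After changing the file, ZED has to be restarted to pick it up; on Debian-based systems the unit is usually called zfs-zed (systemctl restart zfs-zed).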

Darkk · Jan 7, 2024

Pietro395 said:
Are ZFS notifications enabled by default on Proxmox 8.1?
Or do I need to edit the zed config file with my email?

I wonder whether it now reports when the ZFS pool is degraded, not just when it has failed?

thelogh · Mar 20, 2024

I configured Postfix to send emails.
https://www.alldiscoveries.com/conf...th-gmail-or-another-email-server-via-postfix/
In my case, the error was a missing aliases.db.
Check the Postfix logs with the command:
journalctl -u 'postfix*'
"Feb 13 10:51:18 pve-4 postfix/local[2606]: error: open database /etc/aliases.db: No such file or directory
Feb 13 10:51:18 pve-4 postfix/local[2606]: warning: hash:/etc/aliases is unavailable. open database /etc/aliases.db: No such file or directory
Feb 13 10:51:18 pve-4 postfix/local[2606]: warning: hash:/etc/aliases: lookup of 'error' failed".
Or simply check whether the "/etc/aliases.db" file exists.
I don't know why, but on standalone installations of the latest Proxmox VE 8.1-2 (I haven't tried yet with nodes joined to a cluster) the file is not generated automatically.
Run the command manually:
newaliases
and restart Postfix:
systemctl restart postfix
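
The check described above can be sketched as a small shell helper. The function name aliases_db_missing is made up for illustration; it only tests for the compiled database next to the aliases file and prints a hint, leaving the actual fix to be run by hand.

```shell
# Returns 0 (true) when the compiled alias database is missing.
# $1: path to the aliases source file (defaults to /etc/aliases).
# The function name is hypothetical, not part of Postfix.
aliases_db_missing() {
    aliases_file="${1:-/etc/aliases}"
    [ ! -e "${aliases_file}.db" ]
}

# Print a hint instead of changing anything automatically.
if aliases_db_missing; then
    echo "aliases.db is missing: run 'newaliases', then 'systemctl restart postfix'"
fi
```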

gogoFC · Oct 7, 2024

Madozu said:
I have "mixed results" ... some ZFS notifications work, some not. Here's what I tested:

Preparation

Using Proxmox VE 7.3-3

Ran "apt install mailutils" (as per the above suggestion)

Created a ZFS pool "local-zfs" with 3 disks using the PVE GUI

Migrated a VM disk to the pool (just to have some data there)

Tested the below 3 scenarios, all of which end in a degraded pool

Scenario 1 (working)

Command "zpool offline -f local-zfs ata-QEMU_HARDDISK_QM00005"
--> Email with subject "ZFS device fault for pool 0xBDE81065A6D18BCB on pve-vm1"

Command "zpool clear local-zfs"
--> Email with subject "ZFS resilver_finish event for local-zfs on pve-vm1"

Scenario 2 (not working)

Command "zpool offline local-zfs ata-QEMU_HARDDISK_QM00005"
--> No email (even though pool shows as "degraded")

Command "zpool online local-zfs ata-QEMU_HARDDISK_QM00005"
--> Email with subject "ZFS resilver_finish event for local-zfs on pve-vm1"

Scenario 3 (not working)

Shut down PVE node

Unplug one of 3 hard disks of the pool

Start the PVE node and modify some data on the degraded pool (to force resilvering)
--> No email (even though pool shows degraded)

Shut down PVE node

Replug the unplugged hard disk

Start the PVE node

Command "zpool status" shows "scan: resilvered 464K in 00:00:00 with 0 errors on <timestamp of just now>"
--> No email (even though resilvering finished for the pool as in scenario 1 and 2)

Conclusions

Device failures (done with zpool offline -f) do trigger a mail alert --> Good!

The fact that a pool is degraded does not trigger an alert

Missing member disks of a pool (e.g. after a reboot) do not trigger an alert

Resilvering completed in very short time right after a reboot does not trigger an alert

Question: Any ideas on how to solve that inconsistent behaviour?

Two years later...

Did you ever get this to work?

Does anyone have a working ZED setup that notifies when a disk is UNAVAIL?

gogoFC · Oct 8, 2024

It took me a couple of days to figure this one out.

Notifications are triggered by a state change of a disk vdev, and a state change can't be detected while the server is powered down because ZED is not running.

As someone here explained, offlining a disk is an intentional action by an admin, so it doesn't count.

But yeah, it works as intended.
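
Since ZED reacts to events rather than to the current pool state, one possible complement is to poll the state periodically. The following is a minimal sketch, not something tested in this thread: it assumes the mail command from mailutils is available and that `zpool status -x` prints "all pools are healthy" when nothing needs attention. The function names and the default recipient are hypothetical.

```shell
# Returns 0 (true) when the given `zpool status -x` output indicates a problem.
# Assumption: a healthy system prints exactly "all pools are healthy".
pool_needs_alert() {
    [ "$1" != "all pools are healthy" ]
}

# Hypothetical entry point for a cron job or systemd timer.
# $1: mail recipient (assumption: mail to root is forwarded somewhere useful).
check_pools_and_mail() {
    recipient="${1:-root}"
    status="$(zpool status -x 2>&1)"
    if pool_needs_alert "$status"; then
        # Send the full status output as the mail body.
        printf '%s\n' "$status" | mail -s "ZFS pool problem on $(hostname)" "$recipient"
    fi
}
```

Calling check_pools_and_mail from cron would send a mail on every run for as long as a pool is degraded, including after a reboot with a missing disk, so the interval should be chosen with that in mind.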
