Email Notifications

Kha0sK1d

Is there a list of the email notifications Proxmox sends out? Are they sent periodically, or only when something is wrong? I've confirmed that I can send email from the command line and that it arrives at the destination address.
 

alc

I just pulled the SATA cable from 'sda' on a fresh ZFS RAID1 test box, and it doesn't seem to send out any message, although 'zpool status' reports DEGRADED.

But the box still runs smoothly, hooray for the RAID/ZFS :D

EDIT: After reconnecting the HDD and bringing it back online in the pool ('zpool online rpool <dev>'), I got a message announcing the resilver_finish event. But clearly the 'degraded' event would be critical to get!
 

LnxBil

But clearly the 'degraded' event would be critical to get!

There is no other way than to monitor it yourself with proper tools. You can never rely on on-host monitoring; it should always be done from a remote location.
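A minimal sketch of what "monitor it yourself" could look like, in Python. It relies on `zpool status -x`, which prints "all pools are healthy" when nothing is wrong. The function names are made up for illustration, and how the result is delivered (mail, a remote monitoring system, etc.) is left open.

```python
import subprocess

# String that `zpool status -x` prints when every pool is fine.
HEALTHY_MARKER = "all pools are healthy"


def pools_need_attention(status_output: str) -> bool:
    """Return True when the `zpool status -x` output reports any problem."""
    return HEALTHY_MARKER not in status_output


def read_zpool_status() -> str:
    """Run `zpool status -x` and return everything it printed."""
    result = subprocess.run(
        ["zpool", "status", "-x"],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout + result.stderr


def check_once() -> int:
    """Hypothetical entry point: print the status and return 1 on trouble."""
    output = read_zpool_status()
    if pools_need_attention(output):
        print(output)
        return 1
    return 0
```

Run from cron or a systemd timer, `check_once()` would report a DEGRADED pool on the next run, regardless of which events zed chooses to mail about. A check that runs on the host itself still goes silent if the whole machine dies, which is the argument for polling from a remote system.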
 

AngryAdm

There is no other way than to monitor it yourself with proper tools. You can never rely on on-host monitoring; it should always be done from a remote location.

This is absolutely silly behaviour from zed, and accepting it as "okay" is a bad idea. This should be changed. "It should always be monitored from a remote location"... yeah, right! The software is part of Proxmox, so why would anyone invest in a second server to monitor a hard drive in the first one, when all the first server has to do is send the proper email when a drive is unplugged? Instead it sends one after someone has plugged the drive back in, when that person will obviously be checking zpool status until the resilver is done, which makes the email quite redundant.


I just tried the same thing: ripped out the cable and waited for a notification email about the bad drive... NOTHING.
A few seconds after plugging it back in, the "resilver finished" email appeared.


Sep 10 13:25:15 fs02 kernel: ata9: SATA link down (SStatus 0 SControl 300)
Sep 10 13:25:20 fs02 kernel: ata9: SATA link down (SStatus 0 SControl 300)
Sep 10 13:25:26 fs02 kernel: ata9: SATA link down (SStatus 0 SControl 300)
Sep 10 13:25:26 fs02 kernel: ata9.00: disabled
Sep 10 13:25:26 fs02 kernel: ata9.00: detaching (SCSI 8:0:0:0)
Sep 10 13:25:26 fs02 kernel: sd 8:0:0:0: [sdg] Synchronizing SCSI cache
Sep 10 13:25:26 fs02 kernel: sd 8:0:0:0: [sdg] Synchronize Cache(10) failed: Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Sep 10 13:25:26 fs02 kernel: sd 8:0:0:0: [sdg] Stopping disk
Sep 10 13:25:26 fs02 kernel: sd 8:0:0:0: [sdg] Start/Stop Unit failed: Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Sep 10 13:25:26 fs02 zed[29838]: eid=51 class=statechange pool_guid=0xD6BEA68E732464BF vdev_path=/dev/disk/by-id/ata-ST1000DM003-1CH162_S1D8PXZJ-part1 vdev_state=OFFLINE
Sep 10 13:25:27 fs02 zed[29970]: eid=52 class=config_sync pool_guid=0xD6BEA68E732464BF
Sep 10 13:25:48 fs02 kernel: ata9: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Sep 10 13:25:48 fs02 kernel: ata9.00: ATA-8: ST1000DM003-1CH162, CC46, max UDMA/133
Sep 10 13:25:48 fs02 kernel: ata9.00: 1953525168 sectors, multi 0: LBA48 NCQ (depth 32), AA
Sep 10 13:25:48 fs02 kernel: ata9.00: configured for UDMA/133
Sep 10 13:25:48 fs02 kernel: scsi 8:0:0:0: Direct-Access ATA ST1000DM003-1CH1 CC46 PQ: 0 ANSI: 5
Sep 10 13:25:48 fs02 kernel: sd 8:0:0:0: Attached scsi generic sg6 type 0
Sep 10 13:25:48 fs02 kernel: sd 8:0:0:0: [sdg] 1953525168 512-byte logical blocks: (1.00 TB/932 GiB)
Sep 10 13:25:48 fs02 kernel: sd 8:0:0:0: [sdg] 4096-byte physical blocks
Sep 10 13:25:48 fs02 kernel: sd 8:0:0:0: [sdg] Write Protect is off
Sep 10 13:25:48 fs02 kernel: sd 8:0:0:0: [sdg] Mode Sense: 00 3a 00 00
Sep 10 13:25:48 fs02 kernel: sd 8:0:0:0: [sdg] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Sep 10 13:25:48 fs02 kernel: sdg: sdg1 sdg9
Sep 10 13:25:48 fs02 kernel: sd 8:0:0:0: [sdg] Attached SCSI disk
Sep 10 13:25:49 fs02 zed[30529]: eid=53 class=statechange pool_guid=0xD6BEA68E732464BF vdev_path=/dev/disk/by-id/ata-ST1000DM003-1CH162_S1D8PXZJ-part1 vdev_state=ONLINE
Sep 10 13:25:49 fs02 zed[30642]: eid=54 class=vdev_online pool_guid=0xD6BEA68E732464BF vdev_path=/dev/disk/by-id/ata-ST1000DM003-1CH162_S1D8PXZJ-part1 vdev_state=ONLINE
Sep 10 13:25:50 fs02 zed[30794]: eid=55 class=resilver_start pool_guid=0xD6BEA68E732464BF
Sep 10 13:25:50 fs02 zed[30815]: eid=56 class=history_event pool_guid=0xD6BEA68E732464BF
Sep 10 13:25:51 fs02 zed[31079]: eid=57 class=history_event pool_guid=0xD6BEA68E732464BF
Sep 10 13:25:51 fs02 zed[31094]: eid=58 class=resilver_finish pool_guid=0xD6BEA68E732464BF
Sep 10 13:25:51 fs02 postfix/pickup[20422]: BD2F566123: uid=0 from=<root>
Sep 10 13:25:51 fs02 postfix/cleanup[29598]: BD2F566123: message-id=<20200910112551.BD2F566123@fs02.winwinsol.local>
Sep 10 13:25:51 fs02 postfix/qmgr[20423]: BD2F566123: from=<root@fs02.winwinsol.local>, size=1673, nrcpt=1 (queue active)
Sep 10 13:25:51 fs02 zed[31109]: eid=59 class=config_sync pool_guid=0xD6BEA68E732464BF
Sep 10 13:25:51 fs02 postfix/smtp[29601]: connect to smtp.gmail.com[2a00:1450:4013:c03::6d]:587: Network is unreachable
Sep 10 13:25:53 fs02 postfix/smtp[29601]: BD2F566123: to=<XXXXXX@gmail.com>, relay=smtp.gmail.com[74.125.143.109]:587, delay=1.5, delays=0/0/0.61/0.9, dsn=2.0.0, status=sent (250 2.0.0 OK 1599737153 la17sm6701717ejb.62 - gsmtp)
Sep 10 13:25:53 fs02 postfix/qmgr[20423]: BD2F566123: removed
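
The log can be read mechanically: the sketch below pulls the zed events out of syslog lines like the ones above and flags state changes that would probably not produce a mail. It assumes, and this is an assumption rather than something the log proves, that the stock statechange zedlet only mails for the states FAULTED, DEGRADED and REMOVED. The log is at least consistent with that: eid=51 reports OFFLINE and no postfix activity follows until resilver_finish. All names in the code are illustrative.

```python
import re

# Matches the zed part of a syslog line, e.g. "zed[29838]: eid=51 class=statechange ..."
ZED_LINE = re.compile(r"zed\[\d+\]: eid=(?P<eid>\d+) class=(?P<cls>\S+)(?P<rest>.*)")
VDEV_STATE = re.compile(r"vdev_state=(?P<state>\S+)")

# ASSUMPTION: states the stock statechange zedlet sends mail for.
ASSUMED_NOTIFY_STATES = {"FAULTED", "DEGRADED", "REMOVED"}

# A few lines copied from the log above.
SAMPLE_LOG = """\
Sep 10 13:25:26 fs02 kernel: ata9.00: disabled
Sep 10 13:25:26 fs02 zed[29838]: eid=51 class=statechange pool_guid=0xD6BEA68E732464BF vdev_path=/dev/disk/by-id/ata-ST1000DM003-1CH162_S1D8PXZJ-part1 vdev_state=OFFLINE
Sep 10 13:25:49 fs02 zed[30529]: eid=53 class=statechange pool_guid=0xD6BEA68E732464BF vdev_path=/dev/disk/by-id/ata-ST1000DM003-1CH162_S1D8PXZJ-part1 vdev_state=ONLINE
Sep 10 13:25:51 fs02 zed[31094]: eid=58 class=resilver_finish pool_guid=0xD6BEA68E732464BF
Sep 10 13:25:51 fs02 postfix/pickup[20422]: BD2F566123: uid=0 from=<root>
"""


def parse_zed_events(lines):
    """Extract eid, class and (if present) vdev_state from zed syslog lines."""
    events = []
    for line in lines:
        match = ZED_LINE.search(line)
        if not match:
            continue  # kernel, postfix and other non-zed lines
        state = VDEV_STATE.search(match.group("rest"))
        events.append({
            "eid": int(match.group("eid")),
            "class": match.group("cls"),
            "state": state.group("state") if state else None,
        })
    return events


def silent_state_changes(events):
    """State changes away from ONLINE that fall outside the assumed mail list."""
    return [
        e for e in events
        if e["class"] == "statechange"
        and e["state"] is not None
        and e["state"] != "ONLINE"
        and e["state"] not in ASSUMED_NOTIFY_STATES
    ]
```

Calling `silent_state_changes(parse_zed_events(SAMPLE_LOG.splitlines()))` on the sample returns only the eid=51 OFFLINE event, which is exactly the one that went unannounced here.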
 

apoc

I have recently received a "disk has failed" event from ZED. Looked like this:

The number of I/O errors associated with a ZFS device exceeded
acceptable levels. ZFS has marked the device as faulted.

impact: Fault tolerance of the pool may be compromised.
eid: 190
class: statechange
state: FAULTED
host: file
time: 2020-12-11 19:54:45+0100
vpath: /dev/disk/by-vdev/C0-S4-part1
vphys: pci-0000:01:00.0-sas-phy0-lun-0
vguid: 0xD924681D17354751
devid: scsi-35000039a48d30040-part1
pool: 0x8DF3E60A74E576F0


So I wouldn't blame ZED per se...
But I have to admit that this was a different type of failure (I had not disconnected the data port).
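
For anyone who wants to filter or forward these mails automatically, the body is easy to take apart, since every detail is a `key: value` line. A small sketch, assuming the layout shown in the message above (the function name is made up):

```python
def parse_zed_mail(body: str) -> dict:
    """Collect the `key: value` lines of a ZED notification into a dict."""
    fields = {}
    for line in body.splitlines():
        # Split on the first colon only, so values such as timestamps
        # and PCI paths keep their own colons.
        key, sep, value = line.partition(":")
        key = key.strip()
        # Skip the free-text description lines, which have no single-word key.
        if sep and key and " " not in key:
            fields[key] = value.strip()
    return fields


SAMPLE_MAIL = """\
The number of I/O errors associated with a ZFS device exceeded
acceptable levels. ZFS has marked the device as faulted.

impact: Fault tolerance of the pool may be compromised.
eid: 190
class: statechange
state: FAULTED
time: 2020-12-11 19:54:45+0100
vpath: /dev/disk/by-vdev/C0-S4-part1
vphys: pci-0000:01:00.0-sas-phy0-lun-0
"""

fields = parse_zed_mail(SAMPLE_MAIL)
```

With the sample above, `fields["state"]` is `"FAULTED"` and `fields["class"]` is `"statechange"`, so a mail filter could escalate on those two values.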
 

NDev

Mine hasn't failed yet.
I'm just wondering whether it would notify me at all if it does. ;-)
Because it would be really bad if it didn't.

Regards, ND
 
