Smartd: How to stop notifications for a given ID?

MathieuMD

Member
Jun 2, 2022
We have two HPE servers with moderately old disks (Power_On_Hours > 10K hours) which are reporting offline uncorrectable sectors (SMART ID 198). As a result, we get daily email notifications ("SMART error (OfflineUncorrectableSector) detected on host:").

Now that we know these disks are old, we don't want to keep receiving daily notifications for this specific ID.

According to smartd.conf's manual, options -i and -I should be used:
-i ID [ATA only] Ignore device Attribute number ID when checking for failure of
Usage Attributes. ID must be a decimal integer in the range from 1 to 255.
This Directive modifies the behavior of the '-f' Directive and has no effect
without it.

This is useful, for example, if you have a very old disk and don't want to
keep getting messages about the hours-on-lifetime Attribute (usually
Attribute 9) failing. This Directive may appear multiple times for a single
device, if you want to ignore multiple Attributes.

-I ID [ATA only] Ignore device Attribute ID when tracking changes in the
Attribute values. ID must be a decimal integer in the range from 1 to 255.
This Directive modifies the behavior of the '-p', '-u', and '-t' tracking
Directives and has no effect without one of them.

This is useful, for example, if one of the device Attributes is the disk
temperature (usually Attribute 194 or 231). It's annoying to get reports
each time the temperature changes. This Directive may appear multiple
times for a single device, if you want to ignore multiple Attributes.

Here is our smartd.conf (we tried both -I and -i):
Code:
DEFAULT -n standby -m root -M exec /usr/share/smartmontools/smartd-runner
/dev/sda -i 198
/dev/sdb -i 198
/dev/sdc -i 198
/dev/sdd -i 198
/dev/sde -i 198
/dev/sdf -i 198
/dev/sdg -i 198
DEVICESCAN -d removable

But we still get notified every 24h!

I saw that others have the same problem: SMART error e-mails on PAST temperature issue

Has anybody managed to get this working?
 
According to the manual you quoted,
"This Directive modifies the behavior of the '-f' Directive and has no effect without it."
also
"This Directive modifies the behavior of the '-p', '-u', and '-t' tracking
Directives and has no effect without one of them."

This post seems to confirm that you need to add additional directives:
https://superuser.com/a/1358119
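
For illustration, here is a minimal, untested sketch of what those additional directives could look like for one disk. It assumes the DEFAULT line from the original post stays above it. One caveat: as far as I can tell from the smartd.conf manual, the OfflineUncorrectableSector warning comes from the separate '-U' check rather than from '-f', so '-i 198' on its own may not silence it. The commented-out alternatives use '-U', whose ID argument accepts a trailing '+' (report only when the count increases) or 0 (turn the check off).
Code:
# '-a' includes '-f', so '-i 198' now has a Directive to modify.
/dev/sda -a -i 198
# Alternative (assumption, untested): only warn when the offline uncorrectable count grows.
#/dev/sda -a -U 198+
# Alternative (assumption, untested): disable the offline uncorrectable check entirely.
#/dev/sda -a -U 0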
 
Thank you @wpaynter!

Actually, I ended up finding that the -M once option is exactly what we wanted:

-M TYPE
These Directives modify the behavior of the smartd email warnings enabled
with the '-m' email Directive described above. These '-M' Directives only
work in conjunction with the '-m' Directive and can not be used without it.

Multiple -M Directives may be given. If more than one of the following
three -M Directives are given (example: -M once -M daily) then the final one
(in the example, -M daily) is used.

The valid arguments to the -M Directive are (one of the following three):

once - send only one warning email for each type of disk problem detected.
This is the default unless state persistence ('-s' option) is enabled.

daily - send additional warning reminder emails, once per day, for each type
of disk problem detected. This is the default if state persistence ('-s'
option) is enabled.

diminishing - send additional warning reminder emails, after a one-day
interval, then a two-day interval, then a four-day interval, and so on for
each type of disk problem detected. Each interval is twice as long as the
previous interval.

If a disk problem is no longer detected, the internal email counter is
reset. If the problem reappears a new warning email is sent immediately.

Code:
DEFAULT -n standby -m root -M exec /usr/share/smartmontools/smartd-runner
/dev/sda -M once
/dev/sdb -M once
/dev/sdc -M once
/dev/sdd -M once
/dev/sde -M once
/dev/sdf -M once
/dev/sdg -M once
DEVICESCAN -d removable
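
A shorter variant, as an untested sketch: since DEFAULT directives apply to the entries that follow, '-M once' could presumably be set once on the DEFAULT line instead of on every device. '-M exec' is not one of the three frequency types (once, daily, diminishing), so it should combine with '-M once' rather than be overridden by it. If state persistence ('-s') is enabled on these hosts, the manual excerpt above would also explain why the reminders were daily before this change.
Code:
# Assumption: frequency set once for all entries below; -M exec still selects the mail script.
DEFAULT -n standby -m root -M once -M exec /usr/share/smartmontools/smartd-runner
/dev/sda
/dev/sdb
/dev/sdc
/dev/sdd
/dev/sde
/dev/sdf
/dev/sdg
DEVICESCAN -d removable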