Every day the same S.M.A.R.T error mail

Hi,

Every day I get the same email saying that 3 sectors on my SSD are bad. The SSD has only about 1000 power-on hours.

This message was generated by the smartd daemon running on:

host name: pve
DNS domain: fritz.box

The following warning/error was logged by the smartd daemon:

Device: /dev/sda [SAT], 3 Currently unreadable (pending) sectors

Device info:
P3-1TB, S/N:9P50915005072, WWN:0-000000-000000000, FW:HT5310B3, 1.02 TB

For details see host's SYSLOG.

You can also use the smartctl utility for further investigation.
The original message about this issue was sent at Thu Dec 11 08:19:13 2025 CET
Another message will be sent in 24 hours if the problem persists.
 
Well, even new SSDs can have a fault... What does the SMART log say?

Code:
smartctl -a <device>

A long SMART self-test can be started with:
Code:
smartctl -t long <device>

You can see the progress with:
Code:
smartctl -a <device> | grep "of test remaining"
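
If you only want to keep an eye on one counter (for example attribute 197, the pending sectors) instead of reading the whole report, a small filter can help. This is just a sketch: smart_raw is a made-up helper name, and it assumes the usual ten-column attribute table that smartctl -A prints for ATA devices, with RAW_VALUE as a single number in the last column.

```shell
#!/bin/sh
# smart_raw (hypothetical helper): read a smartctl attribute table on stdin
# and print the RAW_VALUE column (field 10) for the attribute ID given as $1.
# Assumes the standard ATA table layout:
# ID# NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
smart_raw() {
    awk -v id="$1" '$1 == id { print $10 }'
}

# Example call (replace <device> with your disk, e.g. /dev/sda):
#   smartctl -A <device> | smart_raw 197
```

Running it before and after the long test shows whether the pending count changed.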
 
Here is the output of smartctl -a <device>:
Code:
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x0032   100   100   050    Old_age   Always       -       0
  5 Reallocated_Sector_Ct   0x0032   100   100   050    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   100   100   050    Old_age   Always       -       1031
 12 Power_Cycle_Count       0x0032   100   100   050    Old_age   Always       -       3
160 Unknown_Attribute       0x0032   100   100   050    Old_age   Always       -       0
161 Unknown_Attribute       0x0032   100   100   050    Old_age   Always       -       22884
163 Unknown_Attribute       0x0032   100   100   050    Old_age   Always       -       607
164 Unknown_Attribute       0x0032   100   100   050    Old_age   Always       -       611
165 Unknown_Attribute       0x0032   100   100   050    Old_age   Always       -       171
166 Unknown_Attribute       0x0032   100   100   050    Old_age   Always       -       1
167 Unknown_Attribute       0x0032   100   100   050    Old_age   Always       -       74
168 Unknown_Attribute       0x0032   100   100   050    Old_age   Always       -       0
169 Unknown_Attribute       0x0032   100   100   050    Old_age   Always       -       100
175 Program_Fail_Count_Chip 0x0032   100   100   050    Old_age   Always       -       83886080
176 Erase_Fail_Count_Chip   0x0032   100   100   050    Old_age   Always       -       6794784
177 Wear_Leveling_Count     0x0032   100   100   050    Old_age   Always       -       3
181 Program_Fail_Cnt_Total  0x0032   100   100   050    Old_age   Always       -       0
182 Erase_Fail_Count_Total  0x0032   100   100   050    Old_age   Always       -       0
192 Power-Off_Retract_Count 0x0032   100   100   050    Old_age   Always       -       2
194 Temperature_Celsius     0x0032   100   100   050    Old_age   Always       -       54
195 Hardware_ECC_Recovered  0x0032   100   100   050    Old_age   Always       -       0
196 Reallocated_Event_Count 0x0032   100   100   050    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   100   100   050    Old_age   Always       -       3
198 Offline_Uncorrectable   0x0032   100   100   050    Old_age   Always       -       0
199 UDMA_CRC_Error_Count    0x0032   100   100   050    Old_age   Always       -       0
232 Available_Reservd_Space 0x0032   100   100   050    Old_age   Always       -       100
241 Total_LBAs_Written      0x0032   100   100   050    Old_age   Always       -       272390
242 Total_LBAs_Read         0x0032   100   100   050    Old_age   Always       -       35998

SMART Error Log Version: 0
No Errors Logged


My main question is: why do I get the email about the 3 bad sectors every day?
 
I have the same issue and have been looking into what to do. In my case the value is 5.

The article linked above seems to discuss a situation where certain drives can flip the value between 0 and 1 "randomly", which doesn't really match a situation where there are actual bad sectors...
I would prefer to somehow raise the reporting threshold to the current value while I get around to replacing the drive...
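
As far as I can tell from the smartd.conf man page, there is no "threshold" for this, but the -C and -U directives accept a trailing "+", which makes smartd report only when the pending (197) or offline uncorrectable (198) count has increased between two checks. A sketch of what that could look like in /etc/smartd.conf follows; everything on the line other than "-C 197+ -U 198+" is an assumption based on the stock Debian DEVICESCAN entry, so keep whatever your own line already contains and only append the two directives.

```
# /etc/smartd.conf (sketch, not a drop-in file)
# Assumed: the rest of this line matches the Debian default entry.
# Added: -C 197+ and -U 198+, so a stable non-zero count is no longer re-reported.
DEVICESCAN -d removable -n standby -m root -M exec /usr/share/smartmontools/smartd-runner -C 197+ -U 198+
```

smartd has to be restarted afterwards to pick up the change. Note that this only quiets the mail; it does nothing about the sectors themselves.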
 
After replacing several drives last month we received a daily email about a SMART failure on one of the drives that had been removed (I assume, triggered by the hot removal). I realized it was for the same single error in the log which was obviously not recurring since the drive no longer existed. Note the "Dec 11" date in your email.

If I remember correctly, the fix was simply to run systemctl restart smartd, at least in our case.