We have a 9 node cluster.
Two of the nodes have now started sending me SMART mails.
One of them started this morning, after I installed the latest updates and rebooted it yesterday. The other one started logging this about a month ago.
Subject: SMART error (Health) detected on host: <node>
Snippet from mail:
The following warning/error was logged by the smartd daemon:
Device: /dev/bus/0 [megaraid_disk_00], SMART Failure: WARNING: ascq=0x4
Both of these nodes are PowerEdge R620 machines.
When I check the SMART status in the Proxmox GUI, it shows as OK on both of these boxes.
When I check the array status in iDRAC, my arrays are healthy.
I'm not quite sure where to go from here; Proxmox will now mail me daily.
Syslog regularly logs:
May 17 13:31:32 px6pve4 smartd[886]: Device: /dev/bus/0 [megaraid_disk_00], SMART Failure: WARNING: ascq=0x4
How can I narrow this down and, above all, make sure I'm actually replacing defective disks and that this is not some sort of software-induced problem?
I have a hard time accepting two actual disk failures in roughly the same timeframe. These disks are from mid 2020 and mid 2018 respectively, and somehow neither of them triggers a health warning on the RAID controller?
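To make the question more concrete, here is a rough sketch of how I would try to get from the syslog line above to a direct query of the physical disk behind the controller. It only parses the smartd message and prints a smartctl command; it does not run anything. The function names are my own, and I am assuming that the number in [megaraid_disk_00] is the same index that smartctl expects in "-d megaraid,N". I have not confirmed that on these boxes.

```python
import re

# Sample line copied from the syslog excerpt in this post.
SAMPLE = (
    "May 17 13:31:32 px6pve4 smartd[886]: Device: /dev/bus/0 "
    "[megaraid_disk_00], SMART Failure: WARNING: ascq=0x4"
)

# Matches the smartd "SMART Failure" message for a disk behind a MegaRAID controller.
LINE_RE = re.compile(
    r"Device: (?P<dev>\S+) \[megaraid_disk_(?P<idx>\d+)\], "
    r"SMART Failure: (?P<msg>.+)$"
)
ASCQ_RE = re.compile(r"ascq=(0x[0-9a-fA-F]+)")


def parse_smartd_line(line):
    """Return device, disk index, message and ascq from a smartd line, or None."""
    match = LINE_RE.search(line.strip())
    if match is None:
        return None
    ascq = ASCQ_RE.search(match["msg"])
    return {
        "device": match["dev"],
        # Assumption: this index is the N in "-d megaraid,N".
        "disk": int(match["idx"]),
        "message": match["msg"],
        "ascq": int(ascq.group(1), 16) if ascq else None,
    }


def smartctl_command(info):
    """Build (but do not run) a smartctl call for the disk named in the log line."""
    return ["smartctl", "-x", "-d", f"megaraid,{info['disk']}", info["device"]]


if __name__ == "__main__":
    info = parse_smartd_line(SAMPLE)
    print(info)
    print(" ".join(smartctl_command(info)))
```

The idea would be to run the printed command as root on each affected node and compare what the disk itself reports with what the Proxmox GUI and iDRAC show.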