The other day I got an email from my Proxmox setup on my NUC. It said:
This message was generated by the smartd daemon running on:
host name: pve
DNS domain: home
The following warning/error was logged by the smartd daemon:
Device: /dev/sda [SAT], ATA error count increased from 8 to 18
Device info:
WDC WD40PURZ-85AKKY0, S/N:WD-WX22DC1RNPUK, WWN:5-0014ee-26a545142, FW:80.00B80, 4.00 TB
For details see host's SYSLOG.
You can also use the smartctl utility for further investigation.
The original message about this issue was sent at Wed Oct 4 11:08:51 2023 BST
Another message will be sent in 24 hours if the problem persists.
I didn't pursue this because we'd had a power failure just beforehand and I assumed the two were related. This morning, though, I got a notice saying my scheduled backup had failed. Syslog said:
Code:
Oct 08 01:00:01 pve pvescheduler[801271]: <root@pam> starting task UPID:pve:000C39F8:01D71EC9:6521F101:vzdump:100:root@pam:
Oct 08 01:00:02 pve pvescheduler[801272]: INFO: starting new backup job: vzdump --compress zstd --quiet 1 --storage Storage --mailto tom.husband@gmail.com --mode snapshot --mailnotification always --prune-backups 'keep-last=5' --node pve --all 1 --notes-template '{{guestname}}'
Oct 08 01:00:02 pve pvescheduler[801272]: ERROR: Backup of VM 100 failed - unable to create temporary directory '/mnt/pve/Storage/dump/vzdump-qemu-100-2023_10_08-01_00_02.tmp' at /usr/share/perl5/PVE/VZDump.pm line 947.
Oct 08 01:00:02 pve pvescheduler[801272]: INFO: Backup job finished with errors
Oct 08 01:00:02 pve pvescheduler[801272]: job errors
Oct 08 01:00:02 pve postfix/pickup[794071]: 0C61020294: uid=0 from=<root>
Oct 08 01:00:02 pve postfix/cleanup[801279]: 0C61020294: message-id=<20231008000002.0C61020294@pve.home>
Oct 08 01:00:02 pve postfix/qmgr[879]: 0C61020294: from=<root@pve.home>, size=1903, nrcpt=1 (queue active)
Oct 08 01:00:04 pve postfix/smtp[801281]: 0C61020294: to=<tom.husband@gmail.com>, relay=smtp.gmail.com[142.250.97.108]:587, delay=2.5, delays=0.02/0.01/1.4/1.1, dsn=2.0.0, status=sent (250 2.0.0 OK 1696723204 bq16-20020a056122231000b004961bbadb84sm1182938vkb.7 - gsmtp)
Oct 08 01:00:04 pve postfix/qmgr[879]: 0C61020294: removed
Does this mean my HDD is failing? The first email suggested using the smartctl utility to investigate. I'm not sure how to access that, or even what to look for.
When I look at the drive in Proxmox it shows enabled and active.
Any guidance here would be greatly appreciated. Many thanks.
edit:
I looked at S.M.A.R.T and didn't see any failures with that drive.
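For anyone following along, here is a rough sketch of the checks in question, run as root from the Proxmox host shell. It assumes the device is /dev/sda (from the smartd email) and the backup target is /mnt/pve/Storage (from the vzdump error); `check_store` is a made-up helper name, not a Proxmox command. The "unable to create temporary directory" error may simply mean the storage is not mounted or has gone read-only, so that is checked first.

```shell
#!/bin/sh
# Assumed values, taken from the email and syslog above; adjust as needed.
DEV="${DEV:-/dev/sda}"
STORE="${STORE:-/mnt/pve/Storage}"

# Hypothetical helper: report whether the backup target is mounted and writable.
check_store() {
    if ! mountpoint -q "$1"; then
        echo "not-mounted"
    elif [ ! -w "$1" ]; then
        echo "read-only"
    else
        echo "ok"
    fi
}

check_store "$STORE"

# SMART queries need root and the smartmontools package.
if command -v smartctl >/dev/null 2>&1; then
    smartctl -H "$DEV"        # overall health self-assessment
    smartctl -A "$DEV"        # attribute table (reallocated/pending sectors, CRC errors)
    smartctl -l error "$DEV"  # the ATA error log whose count smartd reported
fi
```

If the first check prints anything other than "ok", the backup failure may be a mount problem rather than proof of a dying disk; the attribute table and error log are where to look for the drive itself.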