I have a Samsung PM981a 512 GB NVMe disk (MZVLB512HBJQ).
As far as I understand, the manufacturer-rated endurance of this disk is 200 TB, and the SMART test now reports the disk as FAILED. In reality such a disk can handle many more writes, I suppose something near 1000 TB. Nevertheless, Proxmox sends me a "disk failed" notification every day.
Hetzner's answer about this issue:
Dear Client,
'Critical Warning: 0x04' is caused by "Percentage Used" being above 100%. In its own right, this only indicates that the drive is now out of warranty by the manufacturer. However, as long as 'Available Spare' is greater than 'Available Spare Threshold', you can safely ignore this.
Unfortunately, tools like smartctl will report the disk as failed, so you might need some custom filters for your monitoring.
This topic has been investigated and analyzed with our vendors for a very long time. Unfortunately, it is not possible to disable this warning for our use case. If you insist on it nonetheless, we can offer to replace the SSD for you as a gesture of goodwill.
Thank you very much for your understanding.
So the question is: how can I make Proxmox judge disk health by "Available Spare" versus "Available Spare Threshold" instead of by "Percentage Used", since the latter is useless for many NVMe/SSD drives?
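To make clear what kind of "custom filter" I have in mind, here is a minimal sketch in Python. It is an external check, not a Proxmox setting, and it does not change what smartctl or Proxmox report. It assumes that `smartctl -j -a <device>` prints JSON containing an `nvme_smart_health_information_log` object with `available_spare` and `available_spare_threshold` fields; the function names `spare_is_healthy` and `read_report` are my own, purely illustrative.

```python
import json
import subprocess

# Key under which smartctl's JSON output is assumed to hold the NVMe health log.
LOG_KEY = "nvme_smart_health_information_log"


def spare_is_healthy(report: dict) -> bool:
    """Apply the rule from the Hetzner reply: the disk is considered OK
    as long as Available Spare is greater than Available Spare Threshold.
    "Percentage Used" is deliberately ignored."""
    log = report[LOG_KEY]
    return log["available_spare"] > log["available_spare_threshold"]


def read_report(device: str) -> dict:
    """Run smartctl with JSON output and parse the result.
    smartctl can exit non-zero when it considers the disk failing,
    so the exit status is not treated as an error here."""
    result = subprocess.run(
        ["smartctl", "-j", "-a", device],
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)
```

It could be called as `spare_is_healthy(read_report("/dev/nvme0"))`, where the device path is only an example, and an alert would be sent only when the result is `False`.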