High Total_Bad_Blocks?

hal9008

Hello.

I have some doubts about the health of one of my disks.

The disk is a WD Red WDS100T1R0A SSD that is less than one year old. I have tried to keep frequent writes away from this disk. In the SMART values I see that "Host_Writes_GiB" only shows "529". That is just a little more than the entire space of the disk (500 Gb), so I haven't written much to this disk.

This disk holds the Proxmox installation and three virtual machines. Other partitions of these machines, the ones with more frequent writes, are on other disks.

But Proxmox lags frequently. Sometimes the Proxmox interface freezes for a few seconds. I reviewed the SMART values and saw one that scared me:

169 Total_Bad_Blocks: 609

I have attached some screenshots of the state of this disk to this post. I think this disk must be replaced (perhaps it is still under warranty), but I'm not sure whether it is really the cause of the problem. This system runs on an AMD Ryzen 5 3400G with Proxmox 7.1-12 and 64 GB of RAM (82% in use). The Proxmox partition has 59% free space.

I think the problem must be with the disk. Can anyone tell me if the values on this drive are normal or if I should replace it?
 

Attachments

  • Captura de pantalla 2022-05-11 083651.jpg
  • Captura de pantalla 2022-05-11 083707 - copia.jpg
  • Captura de pantalla 2022-05-11 083616.jpg

Hi,

As far as I know, bad blocks don't have to mean that the SSD is about to break. Some drives come from the factory with bad blocks, and they ship with spare blocks for exactly this reason. The drive in my PC has some as well:

1652257910102.png

According to the SMART values for the bad blocks, the drive still considers itself to be in good shape. The Worst column shows that 100 is the lowest normalized value it has ever recorded, and the Threshold of 0 is the level at which it would report that the drive is about to fail. Going by the SMART values, the drive looks OK. This means a failure is unlikely, but not impossible :). SMART is not perfect and can't predict the future.
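To make the Value / Worst / Threshold comparison concrete, here is a minimal sketch in Python. It assumes the usual convention that a normalized attribute counts as failed once it drops to or below its threshold. The class and function names are made up for illustration, and the current value of 100 is an assumption (the post only mentions Worst = 100 and Threshold = 0).

```python
from dataclasses import dataclass


@dataclass
class SmartAttribute:
    """One row of a SMART attribute table (hypothetical helper type)."""
    name: str
    value: int      # current normalized value, higher is better
    worst: int      # lowest normalized value recorded so far
    threshold: int  # vendor-defined failure threshold


def attribute_status(attr: SmartAttribute) -> str:
    # Assumed convention: failed when the normalized value is at or below
    # the threshold. With a threshold of 0 this can practically never trigger.
    if attr.value <= attr.threshold:
        return "FAILING_NOW"
    if attr.worst <= attr.threshold:
        return "FAILED_IN_THE_PAST"
    return "OK"


# Worst and threshold are from the thread; value=100 is an assumption.
bad_blocks = SmartAttribute("Total_Bad_Blocks", value=100, worst=100, threshold=0)
print(attribute_status(bad_blocks))
```

Note that this only looks at the normalized columns. The raw value (609 here) is not part of the pass/fail decision.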
 
more than the entire space of the disk (500 Gb),
According to the third screenshot, doesn't your disk have 1000GB of storage?

Even so, as a rule of thumb (from some googling; sadly I couldn't find a very reputable source for this), about 1 bad block per GB of storage seems to be normal for newly bought SSDs. So even for a 500 GB drive, a bad block count of around 600 would still check out.
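As a quick sanity check of that arithmetic, a small Python sketch using the 609 from the first post. The function name is made up, and the "about 1 per GB" figure is only the unsourced rule of thumb mentioned above, not a specification.

```python
def bad_blocks_per_gb(total_bad_blocks: int, capacity_gb: int) -> float:
    """Bad blocks per GB of nominal capacity (hypothetical helper)."""
    return total_bad_blocks / capacity_gb


# 609 is Total_Bad_Blocks from the first post.
print(round(bad_blocks_per_gb(609, 1000), 2))  # 0.61 for a 1000 GB drive
print(round(bad_blocks_per_gb(609, 500), 2))   # 1.22 if it really were 500 GB
```

Either way the ratio stays close to 1 per GB.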
The fields you should really pay attention to are GROWN_BAD_BLOCKS and REALLOCATED_SECTOR_CT, as these values show how many blocks have gone bad since you started using the device. As both of them are 0, your SSD should still be healthy.
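If you want to keep an eye on those two counters over time, something like the following Python sketch could pull them out of saved `smartctl -A` output. It assumes the usual ten-column attribute table with the raw value in the last column. The sample text is made up for illustration: only `169 Total_Bad_Blocks` with a raw value of 609 and the two zero counters come from this thread, while the other IDs, flags and exact attribute spellings are assumptions.

```python
# Attribute names to watch, compared case-insensitively.
WATCHED = {"grown_bad_blocks", "reallocated_sector_ct"}

# Illustrative sample only; not the real output of the drive in question.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0032   100   100   000    Old_age   Always       -       0
169 Total_Bad_Blocks        0x0032   100   100   000    Old_age   Always       -       609
170 Grown_Bad_Blocks        0x0032   100   100   000    Old_age   Always       -       0
"""


def grown_defects(smart_text: str) -> dict:
    """Return {attribute name: raw value} for the watched attributes."""
    result = {}
    for line in smart_text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        if fields[1].lower() in WATCHED:
            result[fields[1]] = int(fields[9])
    return result


counters = grown_defects(SAMPLE)
print(counters)
print("blocks went bad after purchase:", any(v > 0 for v in counters.values()))
```

If either counter starts climbing between checks, that would be a much stronger reason to replace the drive than the total count alone.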

I think the problem must be with the disk.
If smartctl or similar tools don't show any other concrete errors, I wouldn't be so sure. A freezing, slow and unresponsive system could just as well be caused by high CPU or RAM usage. For example, other programs may be hogging most of the CPU, or the system may have started swapping memory to disk, which can be quite slow.
Could you please check whether you are experiencing high system load, e.g. with free -h or htop?
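For a rough idea of what to look for in those numbers, here is a minimal Python sketch that computes the share of RAM in use and the amount of swap in use from the contents of /proc/meminfo. The function names are made up, it assumes a kernel that reports MemAvailable, and the sample values are invented so that they land on the 82% mentioned in the first post.

```python
def parse_meminfo(text: str) -> dict:
    """Parse /proc/meminfo-style text into {field: value in kB}."""
    values = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[key.strip()] = int(parts[0])
    return values


def memory_summary(text: str):
    """Return (percent of RAM in use, swap in use in kB)."""
    m = parse_meminfo(text)
    used_pct = 100.0 * (m["MemTotal"] - m["MemAvailable"]) / m["MemTotal"]
    swap_used_kb = m["SwapTotal"] - m["SwapFree"]
    return used_pct, swap_used_kb


# Invented sample values, not taken from the system in this thread.
SAMPLE_MEMINFO = """\
MemTotal:        1000 kB
MemAvailable:     180 kB
SwapTotal:        500 kB
SwapFree:         400 kB
"""

used_pct, swap_used_kb = memory_summary(SAMPLE_MEMINFO)
print(f"RAM in use: {used_pct:.0f}%, swap in use: {swap_used_kb} kB")
```

On the host itself you would pass in the text read from /proc/meminfo instead of the sample. High RAM usage combined with a growing amount of swap in use would fit the short freezes described above.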

Also, just for completeness' sake, what is your PVE version? You can check it with pveversion -v.
 
