[SOLVED] NVME disk "Available Spare" problem.

Hi all!

I installed Proxmox in 2021 and it has just worked ever since. My system is PVE 6.4-15, the last release of the 6.4 branch. I know it is obsolete, but for my needs (home server) it just works, without any problems. It's there and I had almost forgotten I have it. Operation and performance are great, but the hardware...

The boot disk (/dev/nvme0n1 - Samsung SSD 970 EVO) is signaling that there is only 26% of "Available Spare" (spare sectors, I presume) left on it:

[screenshot: SMART values for /dev/nvme0n1 showing Available Spare at 26%]

This disk is the system/boot disk only. For the VMs I have a separate ZFS raidz pool (4 SSDs in a RAID-like setup) and also one additional SSD (connected via USB) just for VM backups. The other disks are doing fine (below). But I believe I need to replace the boot device (/dev/nvme0n1) ASAP.

[screenshots: SMART values for the remaining disks, all without issues]

My plan: buy a new NVMe disk - probably the same model as this one (if that is not a good choice, please suggest a better model) - and then binary-duplicate the current disk to the new one. Then put the new one into the server and everything should be fine. This would be optimal, but I don't know if it will work. Are there any problems that can occur during that process? Or what is the best and proven way to replace a Proxmox NVMe system disk?

Thanks!
 
What I would do after you get a new NVMe at least as large as the old one:


1. Shutdown node.
2. Attach new NVMe to node (another slot or maybe with an NVMe-to-USB adapter etc.)
3. Boot up with a Live Linux medium (almost anything: GParted, SystemRescue etc.) without mounting any HDD/SSD etc.
4. dd from the old NVMe to the new one. MAKE SURE YOU CORRECTLY IDENTIFY OLD & NEW (if you don't, it's probably game over!).
5. Shutdown node.
6. Remove old NVMe (DO NOT DISCARD/ERASE it, in case something goes wrong - then you can simply reinsert it).
7. Insert new NVMe.
8. Boot node & you should be good to go.

I've never had success with Clonezilla when dealing with PVE - so just use dd; I've done so myself many times successfully.
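For step 4, the dd itself would look something like the sketch below - the device names are only examples (assumptions, not taken from your setup), so verify which disk is which with lsblk before running anything:

Code:
lsblk -o NAME,SIZE,MODEL,SERIAL
#(identify the old and the new NVMe by size/model/serial)

dd if=/dev/nvme0n1 of=/dev/nvme1n1 bs=32M status=progress
#(if= must be the OLD source disk, of= the NEW target disk - double-check!)
sync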

Alternatively you could also make a zipped dd image of the original NVMe - that way you would always be able to fully revert to your currently working PVE OS in the future. I do this regularly.
 
Regarding whether or not you need to change the drive - IDNK

What did available spare show last time you looked?

I would think the important part is Percentage Used - which shows 0% - so at least S.M.A.R.T. believes it's still got a full life ahead!

I do know you can NEVER (accurately) rely on S.M.A.R.T. data.
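That said, if you want to read those values yourself on the node, something like this works (assuming the controller shows up as /dev/nvme0 and smartctl / nvme-cli are installed):

Code:
smartctl -a /dev/nvme0
#(NVMe health log: Available Spare, Available Spare Threshold, Percentage Used, Data Units Written, ...)

nvme smart-log /dev/nvme0
#(same data via nvme-cli: apt install nvme-cli)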
 
> My plan: buy a new NVMe disk - probably the same model as this one (if that is not a good choice, please suggest a better model) - and then binary-duplicate the current disk to the new one

I would go with a Pro model instead of an EVO, and look at the TBW rating - the higher the better.

Otherwise check eBay and see if they have refurbished enterprise SSDs that fit your sizing needs.
 
> Regarding whether or not you need to change the drive - IDNK

I checked the SMART stats on my two (new) NVMe's and they're both at 100% available spare, so yeah, it's probably a good idea to replace the drive. If you have a free slot or an adapter, you can still put it into secondary-storage use for backups or whatever until it dies.

PROTIP - if you're not running a cluster, turn off the cluster services. This will limit writes to the OS drive. You may also want to set up zram and log2ram.
Also set 'noatime' on all filesystems and 'atime=off' on ZFS - see the sketch below.
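Roughly like this - just a sketch, assuming a standalone node, an ext4 root listed in /etc/fstab and a ZFS pool called rpool; adjust names to your setup:

Code:
#(HA services are not needed on a standalone node and write to the OS drive)
systemctl disable --now pve-ha-crm.service pve-ha-lrm.service

#(add noatime to the root filesystem options in /etc/fstab, e.g.:)
#(UUID=<root-uuid>  /  ext4  defaults,noatime,errors=remount-ro  0  1)

#(turn off atime updates on the ZFS pool)
zfs set atime=off rpool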
 
  • Like
Reactions: GazdaJezda
Code:
dd if=/dev/zero of=/root/zeroes bs=$((1024*1024)); rm -f /root/zeroes
fstrim -v /

Then wait 5 minutes or so and reboot once, just to be sure, and check with smartctl again :)
It has only 778 GB written - that SSD is basically brand new, lol
 
I'd be careful with that - in case the disk is really failing.

Better - shutdown & all power off - then restart.
That's good advice! A shutdown/power-off/start fixed some NVMe issues I had in the past with consumer NVMe drives too.
 
Regarding whether or not you need to change the drive - IDNK

What did available spare show last time you looked?

I would think the important part is Percentage Used - which shows 0% - so at least S.M.A.R.T. believes it's still got a full life ahead!

I do know you can NEVER (accurately) rely on S.M.A.R.T. data.

Yes, I also don't know if anything needs to be done, but googling tells me that when that metric falls below 10 it becomes critical. I was aware of SSD wear when I installed Proxmox (switched from ESXi), so I regularly watch those numbers (taking screenshots of the Disks section in the ProxMon Android app). Over the last few years the value went like this (dates where it differs from the last noted one):

2022-10-07: 77 %
2023-10-15: 52 %
2023-11-10: 48 %
2023-12-22: 47 %
2024-01-16: 39 %
2024-04-11: 26 %

So I believe I need to do something now, since I really need that server to keep running.

Alternatively you could also make a zipped dd image of the original NVMe - that way you would always be able to fully revert to your currently working PVE OS in the future. I do this regularly.

If I understand correctly, doing that I would make a snapshot of the current NVMe boot disk and have it ready for later restoration to a new disk? If yes, can you please tell me a bit more (how can I do that)? Can it be done without a live CD (my server does not have a CD/DVD unit)? I would like to do that ASAP and store it. Then I will buy a new disk, restore that snapshot to it and try switching it physically in the server. That sounds like an almost perfect solution :)

Thank you.
 
Code:
dd if=/dev/zero of=/root/zeroes bs=$((1024*1024)); rm -f /root/zeroes
fstrim -v /

Then wait 5 minutes or so and reboot once, just to be sure, and check with smartctl again :)
It has only 778 GB written - that SSD is basically brand new, lol

Yes, it was brand new when it went into the computer (now it is three and a half years old and running 24/7). Only Proxmox was ever installed on it. If I understand you correctly, you suggest I run the following commands:
  • dd if=/dev/zero of=/root/zeroes bs=$((1024*1024))
  • rm -f /root/zeroes
  • fstrim -v /
  • shutdown (& power off the server completely, wait a few minutes, then restart it and check again)
If that is correct I can do that later today.
 
So I believe I need to do something now, since I really need that server to keep running.
It looks like it will soon fail.

IN MY OPINION YOU NEED TO BACK UP STRAIGHT AWAY.
I WOULDN'T DO ANYTHING ELSE. (DO NOT DO WHAT IS SUGGESTED IN THE POST/S ABOVE OF TESTING THE DRIVE FURTHER - YOU MAY KILL THE DRIVE WITH THESE TEST/S.)

Can it be done without a live CD (my server does not have a CD/DVD unit)? I would like to do that ASAP and store it.
No, you need some Live Linux media; a USB stick will do. (The same way you installed Proxmox in the first place - without a CD/DVD unit?) You can only do it when the NVMe is unmounted & not in use by the OS.

You'll need another storage medium (besides the Live Linux media) on which to store the zipped image of the failing NVMe.

So what you should do:

1. Shut down node.
2. Boot up with Live media.
3. Attach extra storage media to node.

Then issue following command/s:
Code:
mount /dev/xxx /mnt
#(mount extra storage device)


tmux
#(only optional - to enable leaving the process running)


dd if=/dev/YYY bs=32M status=progress conv=sync,noerror | gzip -c > /mnt/prxmx_node_name$(date +'%Y_%m_%d_%I_%M_%p').img.gz
#(YYY is your failing NVMe system os disk)

This will create a zipped image (name time-stamped for future reference) of your failing NVMe & store it in your (mounted) storage location.
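For completeness: restoring that image onto the replacement disk later would look roughly like this (a sketch only - ZZZ stands for the new NVMe and the file name is just an illustration, use the one the backup actually created):

Code:
gunzip -c /mnt/prxmx_node_name2024_05_16_10_30_AM.img.gz | dd of=/dev/ZZZ bs=32M status=progress
#(ZZZ is the NEW NVMe - double-check the device before writing)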

Good luck.
 
Thank you! Will do that today after work and will post the result here. Also, I will order a replacement. Just need to create a bootable USB. I don't remember how I installed it, honestly :) I have a SuperMicro board with IPMI; maybe I'll mount an external CD or similar, I really don't remember.
 
Yes, it was brand new when it went into the computer (now it is three and a half years old and running 24/7). Only Proxmox was ever installed on it. If I understand you correctly, you suggest I run the following commands:
  • dd if=/dev/zero of=/root/zeroes bs=$((1024*1024))
  • rm -f /root/zeroes
  • fstrim -v /
  • shutdown (& power off the server completely, wait a few minutes, then restart it and check again)
If that is correct I can do that later today.
Don't do the dd and rm -f commands separately; do it as one command, exactly as I posted above,
because the first command will write zeroes to your drive (into the zeroes file) until there is absolutely no space left, and the second deletes the zeroes file to make space again.

So basically, as one command, your drive will be completely full for less than one second, which shouldn't cause any issues with services or anything that writes to the root partition, like logs... no "out of space" errors...
If you do them separately, your drive will simply be "out of space" for a longer time, until you manually delete the zeroes file. So the period where the drive is full is simply longer. If you wait too long, it can happen that some service fails etc.

fstrim afterwards marks the blocks where the zeroes were (basically all the free space) as empty.
fstrim sends the discard command directly to the firmware of your NVMe, which then cleans/marks those blocks as empty by itself.
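If you want to verify first that discard/TRIM actually reaches the drive, a quick check would be (assuming the namespace is nvme0n1):

Code:
lsblk --discard /dev/nvme0n1
#(non-zero DISC-GRAN / DISC-MAX values mean the device accepts discard/TRIM)

fstrim -v /
#(reports how many bytes were trimmed on the root filesystem)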

TBH, the available spare degradation could have several reasons; in my opinion 778 GB written is simply not enough for wear-out.
Or in other words, if it is indeed wearing, then the drive was definitely defective from the beginning.
A refresh cycle to keep the data alive shouldn't cause wear-out. Reading doesn't cause wear-out either.
It might even be a firmware issue - you may need to update the firmware.
But TBH, I think the most likely case is simply that there is data that was deleted but the drive was never trimmed, or something like that.
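To at least rule the firmware angle in or out, you can check which revision the drive is running and compare it with Samsung's latest release (device name assumed to be /dev/nvme0):

Code:
smartctl -i /dev/nvme0
#(the "Firmware Version" line shows the active revision)

nvme fw-log /dev/nvme0
#(firmware slot log via nvme-cli)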

However, like others say, spare degradation can indeed be a sign of real drive degradation as well. (I would blindly confirm that if the drive had 50+ TB written, but not with 778 GB written.)
But as there is a chance that this could indeed be degradation, you should back it up, just to be sure.

Cheers
 
Please, just to clarify this, and then I will stop 'bugging' you :)

My disks are:

[screenshot: Disks overview of the node]

The problematic disk is: /dev/nvme0n1

Disks /dev/sda, /dev/sdb, /dev/sdc, /dev/sdd - these are used in the ZFS raidz volume.

Backup disk: /dev/sde is already an SSD, connected via a USB enclosure. Can I mount it and use it for saving the backup? It already contains VM backups. Is that OK or can it be a problem? If it is fine, then the commands below would be:

Code:
mount /dev/sde1 /mnt
#(mount extra storage device)

tmux
#(only optional - to enable leaving the process running)

dd if=/dev/nvme0n1 bs=32M status=progress conv=sync,noerror | gzip -c > /mnt/prxmx_node_name$(date +'%Y_%m_%d_%I_%M_%p').img.gz
#(nvme0n1 is the failing NVMe system OS disk)

Is that correct to use? I'm asking because I have trouble with the device naming (not so familiar with it):
  • /dev/nvme0n1
or
  • /dev/nvme0
Thank you again!

P.S. - when I installed Proxmox, I used the "Mount Virtual drive" BIOS option to mount an ISO as a bootable drive. I will use that again to mount the Live Linux ISO.
 
  • /dev/nvme0n1
    -> That's the namespace of the NVMe, i.e. the actual block device that holds the data/partitions.
  • /dev/nvme0
    -> That's the controller/raw device itself; you can split it into multiple namespaces (if the disk supports it), for passthrough for example. Imagine it as the PCIe device itself, and the namespaces on it as the disks.
That's the easiest description I could write.
So whatever you would normally do on a "usual" drive - mounting/formatting/dd/etc. - you do on the namespace; in your case nvme0n1 (see the example below).
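You can see the difference directly on the node, for example (assuming nvme-cli is installed):

Code:
nvme list
#(lists the namespaces, e.g. /dev/nvme0n1, with model, capacity and firmware)

ls -l /dev/nvme0 /dev/nvme0n1
#(nvme0 is a character device - the controller; nvme0n1 is the block device you partition/format)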

It's the same as with NICs that support SR-IOV; you've probably seen those, they start with something like enp0f0s1: enp0 is the whole PCIe NIC itself, f0 is the primary function (i.e. the port of the NIC), and np0 or s1 is the virtual function, because you can split a NIC port into multiple virtual functions if you want to pass a VF through to a VM.
You can do that to avoid using a virtio NIC, for example; a virtual function of the NIC is faster than an emulation layer like virtio.
But there are some downsides to that as well, e.g. the vmbr won't be able to communicate with a virtual function without some tweaks.

The same goes for NVMe's: you could split them for passthrough reasons as well, but I don't know anyone who is doing that.
Cheers
 
Thanks! I will use nvme0n1 for cloning / backing up the current disk. I also ordered a replacement (a SAMSUNG 970 EVO PLUS 500GB SSD instead of a SAMSUNG 980 PRO 500GB SSD - the latter is PCIe 4.0 and my board only supports 3.0; my hardware seller spotted that). It will be here next week.

Best regards!
 
As I have already pointed out to the OP above, I wouldn't do anything like you have suggested - this itself could contribute to drive failure!
Let's simply wait a week or so, until he has his new drive and a backup.
Then he can do that without any fear and check smartctl again, or in the worst case replace the drive.
 
