Proxmox and SSDs

tgx

New Member
Jun 17, 2024
We are nearing a decision on which platform we will choose to purge VMware from our infrastructure. We are down to two products, with Proxmox still in the running. One area of concern we have found during our tests is the rapid degradation of SSDs when running Proxmox. After only a few days of service, the SMART wear indicator jumped 20%. We have had these same model SSDs in service with other installed systems for years and have never seen such rapid degradation. Does anyone have insight into possible configuration issues that might cause this phenomenon? Being able to see the SMART registers is a very useful feature, but it is yielding some alarming results.
 
That is nothing specific to Proxmox VE; it is the additional write amplification from ZFS through the small checksum, fill, and parity writes.
Use enterprise SSDs, or ZFS mirrors instead of RAIDZ configs, when using ZFS.
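To make the difference concrete, here is a rough sketch of the two pool layouts (pool name and device paths are placeholders, not taken from this thread); a mirror simply writes each block twice, while RAIDZ adds parity and padding to every small record:

Code:
# higher write amplification for small (e.g. 8-16k) blocks: RAIDZ1
zpool create -o ashift=12 tank raidz1 /dev/sdb /dev/sdc /dev/sdd /dev/sde

# usually gentler on SSDs for VM workloads: striped mirrors
zpool create -o ashift=12 tank mirror /dev/sdb /dev/sdc mirror /dev/sdd /dev/sde

# watch how much actually hits each member while your VMs run
zpool iostat -v tank 60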
 
Does anyone have insight into possible configuration issues that might cause this phenomenon?

Are you using:

1) ZFS on root;
2) Clusters (which need to satisfy the extended virtual synchrony for the shared filesystem);
3) HA, which is constantly evaluating what's going on and updating it in the underlying filesystem mirrored onto the SSD?

Experiment with the combinations and see if this is the source of your issue (a few quick checks are sketched below).
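A few quick ways to check each of those on a node (assuming a standard PVE install; adjust to your setup):

Code:
findmnt -no FSTYPE,SOURCE /   # 1) is the root filesystem ZFS?
pvecm status                  # 2) is this node in a cluster, and is corosync quorate?
ha-manager status             # 3) are any HA resources defined and being tracked?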
 
We are nearing a decision on which platform we will choose to purge VMware from our infrastructure. We are down to two products, with Proxmox still in the running. One area of concern we have found during our tests is the rapid degradation of SSDs when running Proxmox. After only a few days of service, the SMART wear indicator jumped 20%. We have had these same model SSDs in service with other installed systems for years and have never seen such rapid degradation. Does anyone have insight into possible configuration issues that might cause this phenomenon? Being able to see the SMART registers is a very useful feature, but it is yielding some alarming results.
As has already been said, this is not due to Proxmox.
Only consumer SSDs without PLP are subject to such quick wear.
Were the SSDs previously operated on a RAID controller with a battery-backed cache? Such a controller protects the SSDs considerably and writes the I/O in an optimized way to keep wear to a minimum. If you then operate the same SSDs with RAIDZ1 or RAIDZ2, for example, you have the highest possible write amplification and therefore much more wear.
If you want to turn your existing ESXi servers into PVE hosts and have cheap boot SSDs on a RAID controller, simply leave the RAID1 in place and install PVE with ext4.
For the VMs I recommend datacenter NVMe drives; they have no problem with wear-out, even with RAIDZ setups.
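If you want to put a number on the wear rather than just watch the percentage, you can track how much the drives actually write over a day or two. Attribute names vary by vendor and interface, so treat these as examples:

Code:
smartctl -A /dev/nvme0n1   # NVMe: note "Percentage Used" and "Data Units Written"
smartctl -A /dev/sda       # SATA: look for attributes like Wear_Leveling_Count or Total_LBAs_Written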
 
As has already been said, this is not due to Proxmox.

@tgx You will notice on this forum that Proxmox VE is an entirely flawless product. ;)

Only consumer SSDs without PLP are subject to such quick wear.

@tgx If you are curious: apt install iotop, then create e.g. 10 nodes (you can completely virtualise this) and 10 resources, even just containers, ideally with shared storage off the nodes; something that just sits there idling is fine. Then activate High Availability on them. On any single node that you won't be taking down during this exercise, run iotop -oP (once interactive, press a for cumulative results), watch for pmxcfs amongst others, and start migrating those resources around a bit (you can simulate some nodes dying in the process, bring them back up, etc.). You can compare this with what's going through to systemd-journald. Then make up your mind and imagine how it scales.

Were the SSDs previously operated on a RAID controller with a battery-backed cache? Such a controller protects the SSDs considerably and writes the I/O in an optimized way to keep wear to a minimum. If you then operate the same SSDs with RAIDZ1 or RAIDZ2, for example, you have the highest possible write amplification and therefore much more wear.

@tgx And combine that with this piece of information and the choice of ZFS.

If you want to turn your existing ESXi servers into PVE hosts and have cheap boot SSDs on a RAID controller, simply leave the RAID1 in place and install PVE with ext4.

@tgx Yes. And check whether you mind the iotop numbers you get with the above experiment.
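If you don't want to keep iotop open, a rough alternative for watching what pmxcfs alone writes (pidstat comes from the sysstat package; the interval is in seconds):

Code:
pidstat -d -p $(pidof pmxcfs) 60     # kB written per second by pmxcfs, sampled every minute
cat /proc/$(pidof pmxcfs)/io         # cumulative write_bytes since the process started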
 
We are nearing a decision on which platform we will choose to purge VMware from our infrastructure. We are down to two products, with Proxmox still in the running. One area of concern we have found during our tests is the rapid degradation of SSDs when running Proxmox. After only a few days of service, the SMART wear indicator jumped 20%.
I suspect that you ran VMware with a hardware RAID5 (with BBU?) and Proxmox with RAIDZ1, but those are entirely different in performance and usable space. There are better ZFS configurations (with PLP drives) for VMs, which will give you less write amplification. Or, if you want, you can run Proxmox with the same hardware RAID5 (with/without BBU) and your existing drives.
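Two settings worth checking on an existing pool, since they largely drive ZFS write amplification for VM disks (the dataset name below is just an example following the default PVE naming scheme):

Code:
zpool get ashift rpool                          # should match the drives' physical sector size
zfs get volblocksize rpool/data/vm-100-disk-0   # a small volblocksize on RAIDZ inflates parity/padding overhead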
 
