SSD Drives

Spartan67

Good afternoon...

I've read through the forum trying to find an answer, but I'm still unsure whether it is a good idea to use SSD drives. My cluster uses LVM-Thin.

One of my hypervisors died (thank God for Proxmox Backup :)) and I am trying to decide on hardware for its replacement. I keep hearing about wearout on SSD drives with Proxmox and would like to avoid having a failure because of it.


Thanks,
Ben
 
Are you planning to just run Proxmox itself on SSDs? That is not that big of a deal. If you are planning to run your VMs from SSDs, there are a few things to consider: what type of VMs, what storage type, etc.

As a general rule you can run Proxmox on regular SSDs. For running VMs on SSDs in production, depending on the number of VMs, I would use a minimum of "mixed use" enterprise SSDs. I have an enterprise Intel SSD labeled as mixed use with 3 DWPD (drive writes per day) endurance, and after 1.5 years running about 30+ VMs on it, it still shows 100% endurance left.

If you have any systems/VMs with a lot of writes, then you need to consider drives with higher endurance.

In our Ceph cluster we have 12 Toshiba enterprise drives rated at 10 DWPD, spread among 4 nodes, with 150 VMs running on them. After 4 years I see anywhere from 6% to 9% endurance lost, and the drives are still working perfectly fine (I have backup drives in case one of them fails).

For low-key VMs we also use regular spinners, like 2.2TB Seagate 10K SAS drives, and they perform very well in RAID 1 or a Ceph cluster. They are much cheaper than SSDs (you can get them for $280) and you don't have to worry about endurance.
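
If you want to check the remaining endurance on your own drives, smartctl can report it. A minimal sketch (assuming smartmontools >= 7.0 for the JSON output, run as root; the SATA attribute names are vendor-specific, so treat the ones below as examples):

Code:
#!/usr/bin/env python3
# Minimal sketch: read SSD wear via smartctl's JSON output.
# Requires smartmontools >= 7.0; attribute names vary by vendor.
import json, subprocess, sys

dev = sys.argv[1] if len(sys.argv) > 1 else "/dev/sda"
raw = subprocess.run(["smartctl", "-j", "-a", dev],
                     capture_output=True, text=True).stdout
data = json.loads(raw)

nvme = data.get("nvme_smart_health_information_log")
if nvme:
    # NVMe reports wear directly: 0 = fresh, 100 = rated endurance used up.
    print(f"{dev}: {nvme['percentage_used']}% of rated endurance used")
else:
    # SATA SSDs use vendor-specific attributes, e.g. Intel's
    # Media_Wearout_Indicator counts down from 100.
    for attr in data.get("ata_smart_attributes", {}).get("table", []):
        if "Wear" in attr["name"]:
            print(f"{dev}: {attr['name']} = {attr['value']}")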
 
Yep. Using HDDs as your VM/LXC storage, you easily run into problems (especially high IO delay) because HDDs lack the IOPS performance needed to run multiple OSs in parallel. SSDs are totally fine; you just need to buy proper models and pay a multiple for them, like brucexx already explained. You usually only run into performance/endurance problems with SSDs when buying consumer SSDs, and especially QLC consumer SSDs.
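
To see why, compare the random-IOPS budget each guest gets. The figures below are rough assumptions, not benchmarks: a spinner manages on the order of 100-200 random IOPS, while even a SATA SSD manages tens of thousands:

Code:
# Random-IOPS budget per guest; the two drive figures are rough assumptions.
HDD_IOPS = 150     # typical order of magnitude for a 7.2K-10K spinner
SSD_IOPS = 50000   # conservative figure for a decent SATA SSD

for guests in (5, 10, 20):
    print(f"{guests} guests: {HDD_IOPS // guests} IOPS each on HDD, "
          f"{SSD_IOPS // guests} on SSD")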
 
Good to know @_gabriel. @Dunuin & @brucexx, how can you differentiate between the types of SSDs without relying on the company telling you what it is? What spec or feature tells you that it's an enterprise-level piece of hardware?

My main interest in using SSDs is that, as I understand it, the IO delay is significantly reduced, plus the benefit of a cooler-running system and power savings.
 
Enterprise SSDs have built-in power-loss protection. They can run for a few seconds without power, which allows them to cache sync writes in their volatile DRAM cache, resulting in way better sync write performance and less wear.
They are also rated for a far higher volume of writes. A consumer SSD is usually rated for 0.2 to 0.3 DWPD. Read-intensive enterprise SSDs are usually at around 1 DWPD, enterprise SSDs for mixed workloads at around 3 DWPD, and enterprise SSDs for write-intensive workloads can be way higher, at a DWPD of 10 or more. So you can write 10x more data to an enterprise SSD with 3 DWPD, compared to a consumer SSD with 0.3 DWPD, without losing your warranty.
Enterprise SSDs also often use higher-grade NAND chips. NAND grades from best to worst: SLC -> eMLC -> MLC -> TLC -> QLC.
Today you will usually only find SLC and TLC NAND chips in enterprise SSDs, while cheap consumer SSDs use QLC NAND and the more expensive consumer and prosumer SSDs use TLC NAND.
Enterprise SSDs also often have way more spare capacity. I, for example, use 200GB S3710 enterprise SSDs. These have a TBW of 3600 TB; the durability is so high because of the better eMLC NAND chips. And while they only offer 200 GB of usable capacity, they actually contain 336 GB of NAND, where 136 GB are spare cells that will replace failed cells. A 240 GB consumer SSD usually only has 16 GB of spare cells. Unluckily, you won't find these numbers on the manufacturers' datasheets.

So power-loss protection and a high TBW or DWPD rating would indicate an enterprise SSD when looking at the datasheets.
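
For anyone who wants to sanity-check those numbers: DWPD and TBW describe the same endurance budget, just normalized differently. A quick back-of-the-envelope sketch, assuming the usual 5-year warranty period:

Code:
# DWPD <-> TBW conversion; assumes a 5-year warranty period,
# which is the usual figure for enterprise drives.
WARRANTY_DAYS = 5 * 365

def tbw(dwpd: float, capacity_tb: float) -> float:
    """Total terabytes written allowed over the warranty period."""
    return dwpd * capacity_tb * WARRANTY_DAYS

def dwpd(tbw_tb: float, capacity_tb: float) -> float:
    """Drive writes per day implied by a TBW rating."""
    return tbw_tb / (capacity_tb * WARRANTY_DAYS)

print(tbw(0.3, 1.0))    # 1 TB consumer drive at 0.3 DWPD: ~548 TBW
print(tbw(3.0, 1.0))    # 1 TB mixed-use drive at 3 DWPD: ~5475 TBW (the 10x)
print(dwpd(3600, 0.2))  # the 200GB S3710's 3600 TBW: ~10 DWPD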
 

So, just for my understanding: these drives say they are SLC, yet they seem extremely inexpensive, which makes me question the SLC claim.

https://smile.amazon.com/stores/pag...04af-ca2d-4443-ab5f-de7528899716&ref_=ast_bln
 
Don't confuse SLC cache with SLC NAND. Basically all SSDs have an SLC cache: they use cheap TLC/QLC NAND and write to that NAND in SLC mode, so they don't have real SLC chips. Today you basically only find SLC NAND in Intel Optane SSDs, industrial SD cards, and SATA/USB DOMs.

If it doesn't cost you thousands of dollars per TB, it is probably not using SLC NAND.
 
Are consumer SSDs going to be an issue for EXT4 partitions? I'm new to Proxmox and just setting up some simple experiments.

Here is my initial setup:

AMD 5700G (8-core)
64GB DDR4 RAM (non-ECC)
10G Ethernet pcie


Drives
Proxmox: 256GB NVMe TeamGroup
VMs (Windows Server 2022, PopOS): 2TB NVMe Crucial P3 Plus
LXC/Docker Images: 1TB Sandisk SATA SSD
Data/Scratch: 4x 500GB 2.5" HDDs (Raid 10)

I have read a lot of recommendations for enterprise SSDs, and I intend to invest in some when I gain enough experience for a serious setup with ZFS and more complex configurations. For now I want to know if I will experience any issues using ext4 on these drives.

Thanks
 
In general they work fine for ext4 or LVM setups. Do remember, though, that they will be written to by N virtual machines rather than just one, so the expected lifetime will be less than it would be with the same number of physical machines.
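
To put a rough number on that, you can estimate the lifetime from the drive's TBW rating and the combined write rate of the guests. A sketch with made-up numbers; substitute the TBW from your datasheet and the write rates from your monitoring:

Code:
# Rough lifetime estimate for one SSD absorbing the writes of N guests.
# All figures are illustrative assumptions, not measurements.
TBW_TB = 600          # endurance rating, e.g. a typical 1 TB consumer TLC drive
GUESTS = 10           # VMs/CTs sharing the drive
GB_PER_GUEST_DAY = 5  # average writes per guest per day
WRITE_AMP = 2.0       # filesystem + NAND write amplification, workload-dependent

daily_tb = GUESTS * GB_PER_GUEST_DAY * WRITE_AMP / 1000
print(f"~{TBW_TB / daily_tb / 365:.1f} years until the TBW rating is reached")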
 
FWIW, I have been running three Proxmox nodes for over a year on consumer-grade SSDs (TeamGroup MP33 and MP44 drives). I don't run a cluster, and the drives are mirrored using ZFS and running VMs. I have the corosync, pve-ha-crm, and pve-ha-lrm services disabled. My drive wearout on two machines is 0%, and on the third machine it is 2%. My workloads are simple: a couple of VMs running simple Docker containers, and some VMs running WordPress and Nextcloud. All my application data, Docker volumes, etc. sit on NFS shares and not on the SSDs. In a nutshell, I have had no issues with consumer drives. If they wear out in a year or two, I'll replace them and that will be fine. I am only running a home lab and self-hosting some websites that don't generate any revenue for me, so nothing critical. I also back up all my VMs and CTs to an NFS share, so rebuilding a node from scratch would be pretty simple and not time-consuming; I could probably do it in 15-20 minutes. If your environment is more demanding than mine, YMMV.
 
Are your data and backup NFS shares on HDD RAID?
My NFS shares are served from a separate NAS box that runs a RAID 1 mirror (Btrfs with scrubs and snapshots). I also have my NAS boxes connected to the switch with LAGG/LACP. I regularly back up the main NAS to a second, separate physical NAS as well as to the cloud (Amazon Glacier).
 
