New Proxmox system up and working! But I goofed with SSDs, now worried about wearout. What do I do?

SoonerDave

New Member
Dec 20, 2022
I recently finished a "virtualize everything" setup on my home server with Proxmox, hosting a virtualized router (pfSense), Pi-hole, TrueNAS, and a "playground/lab" Ubuntu Server box. I *thought* I had done my due diligence in research while getting everything together for this setup...until I came across a separate thread here today warning AGAINST the use of "consumer" SSDs. Alas. I missed it. I did *exactly* that.

I've had my system up and running now for about two weeks, and I noticed my Proxmox system SSD already has 1% wearout, with over 642M LBAs written.

At this rate, it looks like I'll have to replace the thing in two years. Obviously I have plenty of time to plan accordingly, but clearly I should have seen and heeded the original advice of others here about using enterprise-grade SSDs. Lesson learned. Mea culpa. :) The host SSD is a Samsung 870 1TB, and has all my VMs stored on it. I have a second identical one (presently unused) in the system.
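For anyone who wants to sanity-check those numbers, here's a rough back-of-the-envelope projection. It's only a sketch: the wearout percentage is coarse (it only just ticked over to 1), and the 512-byte LBA size is an assumption you'd want to confirm with smartctl -A on the drive itself.

```python
# Rough projection from the numbers above -- a sketch only. The wearout
# percentage is coarse (it just ticked over to 1), and the 512-byte LBA
# size is an assumption; check `smartctl -A` for your drive's actual values.

wearout_pct = 1             # wearout reported by Proxmox after ~2 weeks
days_elapsed = 14
lbas_written = 642_000_000  # "Total LBAs written" from SMART
lba_size_bytes = 512        # assumed logical block size

years_to_100_pct = days_elapsed / wearout_pct * 100 / 365
tb_written = lbas_written * lba_size_bytes / 1e12
tb_per_year = tb_written / days_elapsed * 365

print(f"~{years_to_100_pct:.1f} years until the drive reports 100% wearout")
print(f"{tb_written:.2f} TB host writes so far (~{tb_per_year:.1f} TB/year)")
```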

With my overall setup pretty stable now, I thought I could image that boot SSD and just stash a copy of it on the second (unused) SSD, leaving that drive otherwise untouched as a backup, so at least that gives me a fallback while I work out something smarter over the longer term.

Admitting my error, and realizing I can't undo it immediately, I figure the best I can do right now is identify any defensive steps I can take not to tax that drive any more than necessary. It has no ZFS partitions, only the small BIOS boot partition, the EFI partition, and a resized LVM partition that takes up the rest of the drive (it didn't seem I needed the LVM-Thin for my initial setup). I was going to use the other SSD for a VM backup (and maybe I still will), but a full image backup almost seems a better idea.

What, if anything, can I do to mitigate the wearout rate on the SSD? Would tweaks to the configuration of the other VMs (TrueNAS comes to mind) offer any defensive measures?

As far as the rest of the config goes (not really sure it's relevant, but perhaps something here will trigger some thoughts from the more learned here):
* TrueNAS VM (given 10GB RAM) runs on a 200GB drive on that SSD, and hosts four Toshiba N300 4TB NAS drives in a RAID 1 config.
* pfSense firewall/router (4GB RAM) and a 50GB drive
* Pi-hole ad-blocking DNS server, 2GB RAM, 50GB drive
* Ubuntu lab server (4GB RAM), on a 100GB drive
* One 1Gb NIC (WAN-facing) is set up for hardware passthrough to the pfSense box, while the other (2.5Gb) NIC is bridged to our home LAN and used by all four VMs via virtio
* I've got one brand new WD 3TB Red NAS drive in the box right now (given to me as a Christmas present), but not committed to anything permanently at this point, and a 4TB WD desktop drive I bought just a few months ago purely to expand the storage in my prior setup, but before I decided to rebuild it all, so it, too, is not entirely committed. There's an old Seagate 2TB drive I'm keeping from the old box primarily in case there are any old files on there I want to pull out of it, but it's already way into its 65,000-hour lifespan, so I wouldn't want to commit anything to it :)

Thanks for taking the time to read and appreciate any input. No brickbats please LOL :)

-sd
 
Admitting my error, and realizing I can't undo it immediately, I figure the best I can do right now is identify any defensive steps I can take not to tax that drive any more than necessary. It has no ZFS partitions, only the small BIOS boot partition, the EFI partition, and a resized LVM partition that takes up the rest of the drive (it didn't seem I needed the LVM-Thin for my initial setup). I was going to use the other SSD for a VM backup (and maybe I still will), but a full image backup almost seems a better idea.
Without LVM-Thin you get more overhead. LVM-Thin lets you create thin-provisioned block devices and hand them to your VMs directly as virtual disks. If you instead put your virtual disks on the "local" directory storage, they are stored as files, so you get nested filesystems and therefore more overhead. And as a "raw" file you lose snapshot functionality; snapshots would work with "qcow2", but qcow2 is a copy-on-write format like ZFS, so that is even more overhead again.
For less wear, I would store the virtual disks as LVM-Thin volumes.
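If you go that route, here is a rough sketch of the steps, wrapped in a small Python script just to keep them in one place. It assumes you first free up space in the existing "pve" volume group (e.g. by shrinking the root LV); the pool name "data", the storage ID "local-thin", and the 400G size are placeholders, not the only sensible values.

```python
#!/usr/bin/env python3
"""Sketch: carve a thin pool out of free space in the 'pve' volume group and
register it as a Proxmox storage. Run on the PVE host as root. The names
('data', 'local-thin') and the 400G size are placeholders -- check with `vgs`
that the VG really has that much free space before running anything."""

import subprocess

def run(cmd):
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

# Create a 400G thin pool named 'data' inside the existing 'pve' volume group.
run(["lvcreate", "--type", "thin-pool", "-L", "400G", "-n", "data", "pve"])

# Register it with Proxmox so VM disks can be created there as thin LVs.
run(["pvesm", "add", "lvmthin", "local-thin",
     "--vgname", "pve", "--thinpool", "data",
     "--content", "images,rootdir"])
```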

What, if anything, can I do to mitigate the wearout rate on the SSD? Would tweaks to the configuration of the other VMs (TrueNAS comes to mind) offer any defensive measures?
Yup, ZFS will cause a lot of that wear. Inside your TrueNAS VM you could move the "system dataset" from the virtual disk on the SSD to the HDD ZFS pool. That way TrueNAS should write far less to that virtual disk on the SSD.

* pfSense firewall/router (4GB RAM) and a 50GB drive
Not sure about pfSense, but my OPNsense lets me choose between ZFS and UFS when installing it. UFS would cause less write amplification there.

* Pi-hole ad-blocking DNS server, 2GB RAM, 50GB drive
You could disable logging, so less gets written to Pi-hole's SQLite DB.
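A sketch of what that could look like inside the Pi-hole VM (assuming a standard install that keeps FTL's config in /etc/pihole/pihole-FTL.conf; the MAXDBDAYS/DBINTERVAL values shown are just one possible choice):

```python
#!/usr/bin/env python3
"""Sketch: reduce Pi-hole's disk writes. Run inside the Pi-hole VM as root.
Assumes a standard install with FTL's config at /etc/pihole/pihole-FTL.conf."""

import subprocess

# Turn off the plain-text query log.
subprocess.run(["pihole", "logging", "off"], check=True)

# Reduce writes from FTL's long-term SQLite database:
#   MAXDBDAYS=0   -> don't keep a long-term query database at all
#   DBINTERVAL=60 -> (if you do keep it) only flush to disk once an hour
with open("/etc/pihole/pihole-FTL.conf", "a") as conf:
    conf.write("\nMAXDBDAYS=0\nDBINTERVAL=60.0\n")

# Restart FTL so the new settings take effect.
subprocess.run(["systemctl", "restart", "pihole-FTL"], check=True)
```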

* I've got one brand new WD 3TB Red NAS drive in the box right now (given to me as a Christmas present), but not committed to anything permanently at this point, and a 4TB WD desktop drive I bought just a few months ago purely to expand the storage in my prior setup, but before I decided to rebuild it all, so it, too, is not entirely committed. There's an old Seagate 2TB drive I'm keeping from the old box primarily in case there are any old files on there I want to pull out of it, but it's already way into its 65,000-hour lifespan, so I wouldn't want to commit anything to it
You can't just buy any HDD. For ZFS, or any copy-on-write filesystem, you should buy CMR (conventional magnetic recording) HDDs and avoid SMR (shingled magnetic recording) HDDs. In my opinion, SMR HDDs should be avoided no matter what you want to do with them. SMR is just terrible...write performance and latency drop to unusable levels once the CMR cache and RAM cache fill up, slowing down the whole system. Basically the same problem you get when buying QLC SSDs.
Even the cheap "WD Reds" use SMR. For CMR you need to buy "WD Red Plus" or "WD Red Pro".
 
@Dunuin Thanks so much for the well-written and comprehensive reply.

I will look into shrinking that LVM partition and re-adding an LVM-Thin partition. It's nowhere near full, so I definitely have some flexibility there. I didn't fully understand the subtleties of the difference between a VM disk based on a file vs. a block device, but now that you've explained it that way the light bulb just went on.

I will immediately look into moving the TrueNAS system dataset onto the HDD ZFS pool.

I originally thought I would be interested in the piHole logging, but I think I've looked at it a grand total of one time since I started it. It works, and that's really all that matters. I'll turn off that logging.

Those are all great suggestions and should offer immediate benefit. Exactly what I needed.

The "other stuff" at the end of the original post was just added for info; when I bought the first drive I had no notion of building a NAS. The other was a gift for Christmas. I *definitely* know to avoid those SMR drives!! At least I did *that* much homework :)
 
Quick follow-up:

I've opted for now to keep my "lab server" VM off unless I have a project. I turned off logging in Pi-hole, and the TrueNAS system dataset was already set to the HDD pool.

It looks like migrating my VM disks (qcow2 files) will require me to shut down each VM, dd the qcow2 files into raw images, move them to LVM-thin storage, then reimport each disk image into Proxmox. Does that ring true? Am I overlooking anything or missing something obvious?

Thanks again!
 
You could just back up the VMs/LXCs (the built-in function in the webUI) and later restore them on the new storage.
Since PVE 7 it's also possible to move virtual disks between storages, and PVE will then automatically convert the virtual disk.
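For the move-disk route, a rough sketch of the CLI side (VM IDs, disk names, and the "local-thin" storage name below are placeholders; the same action is available per disk in the webUI under the VM's Hardware tab):

```python
#!/usr/bin/env python3
"""Sketch: move VM disks from the file-based 'local' storage onto an LVM-thin
storage (here called 'local-thin' -- a placeholder) using `qm move_disk`.
Run on the PVE host; check the real disk names with `qm config <vmid>`."""

import subprocess

# (vmid, disk) pairs to migrate -- examples only, adjust to your setup.
disks = [
    (100, "scsi0"),   # e.g. the TrueNAS VM
    (101, "scsi0"),   # e.g. the pfSense VM
]

for vmid, disk in disks:
    # PVE converts qcow2 -> raw on the fly; --delete 1 removes the old copy
    # once the move succeeds. This can be done live, but shutting the VM
    # down first is the cautious option.
    subprocess.run(
        ["qm", "move_disk", str(vmid), disk, "local-thin", "--delete", "1"],
        check=True,
    )
```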
 
You could just back up the VMs/LXCs (the built-in function in the webUI) and later restore them on the new storage.
Since PVE 7.2 it's also possible to move virtual disks between storages, and PVE will then automatically convert the virtual disk.
Fantastic. Thanks again!! Looks like I have a weekend project