Please help me set up an environment for home use

kokoon - New Member - Feb 25, 2025

Hey people! First time poster here.

I'm a middle-aged, EU-based dude with a background in software engineering. Nothing too fancy, but I can find my way around most of the "IT" problems I encounter in life. I have never, however, seriously ventured into virtualization, hypervisors, or clusters.

My first motivation to set up Proxmox was to migrate my Home Assistant away from a Raspberry Pi (I was running it off an SD card). The secondary reason was to consolidate the rest of my 24/7 household services onto unified hardware: mainly the Roon server, Pi-hole and a torrent client - I had an old and loud desktop box for those and wanted to get rid of it. I'll also need a Windows VM at some point, and I'm sure I'll find a bunch of services I'll want to run locally once the whole thing is up.

So I took the plunge, and fast forward a month: I have a Dell OptiPlex 7080 Micro with an i5-10500T and 32 GB RAM running all four aforementioned services, another identical box on the way, and an old Haswell-era NUC, along with the RPi4 that previously ran Home Assistant. Oh, and I also have an old Synology DS409 as an SMB/NFS NAS. The network is UniFi everything (UDR, etc.).

I'm not sleeping well though, because both boxes I've run Proxmox on started acting up at some point. Before I got the Dell, I was testing out PVE on the NUC (I even upgraded the RAM in it), and every couple of days it froze and needed a hard reset. I never found the root cause, but I suspect the SSD. Thermals were okay, the RAM was brand new, and the SSD was an old Crucial mSATA with a cryptic 4% wearout that I still don't know is counting up or down. In any case, the Dell arrived and that was going to be my new server! I'm planning on running Home Assistant in high availability, so a 2- or 3-node cluster is in the future. But first I migrated the 2 VMs + 2 CTs (from backups) to a fresh 8.3.4 Proxmox install. Btw, I'm running PBS off the RPi4; its storage is NFS plus a USB WD drive.

Everything was beautiful, until (apparently) the exact same thing started happening on the Dell: everything becomes unresponsive, the web interface isn't serving, I can't SSH into the hypervisor or any of the VMs, can't ping anything - the only thing showing signs of life was Home Assistant performing some very simple functions off its cached frontend. But there's no point diagnosing it any further, because yesterday the NVMe (root) drive just died, so I guess that was the reason. Kinda suspicious though: two different SSDs in two different systems acting up?

So if you've read up to this point - thanks for doing so. I will now list some questions, and if you can pitch in with a suggestion, advice, or a comment - anything at all - I'll be grateful.

  1. The main requirement is rock-solid Home Assistant (tricky, I know - but that's a separate story) and Roon Rock, which means some kind of redundancy - will PVE HA do the job?
  2. My plan is to have a 2-node cluster with identical(ish) Dell 7080 Micro boxes and a qdevice running off the RPi4 that's also running PBS - it's not the worst idea for a homelab environment, right?
  3. Including the NUC doesn't really make a lot of sense, so it would probably be a cold spare node - is there a better use for it?
  4. The main thing I'm struggling with is what to use for storage. I haven't dealt with SSDs a whole lot before, but apparently the differences are significant. This is what I really need help with, please!
  • I'm running out of funds, so I'd like to keep it cheap.
  • was it somehow my fault the drive(s?) are dying?
  • I don't need a ton of volume; I'm thinking I'd probably be perfectly fine with 500 GB per node.
  • I can fit 2x 2280 Gen3 NVMe drives, plus a 2.5" SATA drive, into the Dells.
  • do I need a separate, small but "enterprise-grade" drive for root? Or do I really need those for HA'd VMs' shared storage?
  • If I'll be getting into HA soon, does that affect the requirements of the SSDs?
  • I can't wrap my head around: ZFS being hard on SSDs, best way to set up shared storage for HA machines (CEPH I guess?), and ultimately, what affordable drives to get.
  • I don't think I need to run mirrored drives in the nodes if I'll be doing HA, so I guess that at least means a couple fewer drives.

I realize I might be all over the place, please help me out.
 
  • I'm running out of funds, so I'd like to keep it cheap.
You can't build a cluster from scratch including all bells and whistles and pay nothing. That's simply not the way it works.

  • was it somehow my fault the drive(s?) are dying?
For me, everything needs to have redundancy. At least the storage devices; the power supplies in my homelab are not redundant, while those at my day job definitely are. (And there are more aspects: networks, switches, UPS, multiple backup systems, etc.)

  • do I need a separate, small but "enterprise-grade" drive for root?
For using ZFS, "enterprise class" drives are highly recommended. Personally, I do not use anything else. See also: https://forum.proxmox.com/threads/f...y-a-few-disks-should-i-use-zfs-at-all.160037/

  • Or do I really need those for HA'd VMs' shared storage?
If you go for ZFS and Replication, then: yes. And yes, that's what I do mostly.

The boot devices and the VM storage may be on the same devices. A single ZFS pool is fine. (Again: for a Homelab.) This means one single pair of identical (or very similar) devices.

  • If I'll be getting into HA soon, does that affect the requirements of the SSDs?
Are SSDs required? No. But they are recommended - or, alternatively, NVMe drives.

  • I can't wrap my head around: ZFS being hard on SSDs, best way to set up shared storage for HA machines (CEPH I guess?), and ultimately, what affordable drives to get.
For Ceph you need several nodes, not just two or three. See: https://forum.proxmox.com/threads/fabu-can-i-use-ceph-in-a-_very_-small-cluster.159671/

There are many, many, many ways to build a Homelab. The right way is whatever works for you.

Usually 1) expectations, 2) the amount of work and knowledge required, and 3) the money you are willing to spend are diametrically opposed metrics...
 
@UdoB already tackled all the points. The cost of a good cluster worthy of its name (dedicated or distributed shared storage without loss of written data) is just too high for your home and requires too much electricity. I have been running a single node with two enterprise SSDs at home for many, many years, which is the cost-effective way of doing it... including HA, Z2M and all the other stuff you may need. Is this a single point of failure? Yes, sure, but so is everything else... internet connection, switch, power, etc. Get redundancy for the things that break first, have backups, and keep potential (though not necessarily identical) spare hardware to switch to in case of a hardware failure.
 
@UdoB and @LnxBil thanks!

Please help me think this through:
  • Since I had a relatively cheap way to get a whole 2nd box, I was thinking I don't need to get redundant storage in each of them. I realize that duplicating the whole node (and running cheaper consumer drives) means increasing the chance of some other component in the system failing. Still, it "feels" like I'm getting more redundancy for less money going this route vs. higher-grade redundant drives in a single node. Am I at least partially making sense?
  • Similarly, having a redundant node made me believe I don't even need to be running ZFS (and thus can use cheaper storage). I do admit, though, that I'm still pretty confused about all the various benefits of ZFS - currently I see it mostly as a data integrity mechanism, and I might be missing the bigger point. Could you either explain it briefly, or suggest a good online resource to understand this? Why are you guys using ZFS? (btw @UdoB I just opened the post you linked about ZFS - I guess most of my questions are answered there. I will read it, thanks!)
  • Regarding storage - @UdoB do NVMe drives not count as SSDs? I thought it's all solid-state drives, just with a different interface. In any case, I'm looking at either a 970 EVO Plus or a WD Red SA500 - is that good enough? It's a level of expense I could afford right now.
 
Since I had a relatively cheap way to get a whole 2nd box, I was thinking I don't need to get redundant storage in each of them. I realize that duplicating the whole node (and running cheaper consumer drives) means increasing the chance of some other component in the system failing. Still, it "feels" like I'm getting more redundancy for less money going this route vs. higher-grade redundant drives in a single node. Am I at least partially making sense?
Sure. As long as you replicate all VMs, you're right. I must admit that I do not always do this, because sometimes it is "just a test VM", or because of the sheer size of a VM that I explicitly do not want to replicate.

I want all nodes to work reliably. With mirrors in my ZFS pools the chance that a dying SSD kills a full node is... near zero. Of course there are other sources of trouble too.

So my rule is: always ZFS and always mirrors. But yes, there are exceptions. For example, some of my Mini-PCs only have a single M.2 slot (besides dual SATA), so I cannot mirror it. Those offer this single device to be used as a Ceph OSD. Or the M.2 could be the boot device, without storing VMs on it: this approach would massively reduce the amount of data written - though it would contradict my wish to not have the node die if a single storage device goes south.

Note also that for Ceph (with an adequate number of nodes) there is no redundancy inside one node. The "failure-domain" of Ceph usually is "host".
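
To see at a glance whether a mirror is quietly degrading, a minimal check (rpool is just the default pool name the PVE installer creates):

Code:
zpool status -x       # prints "all pools are healthy" or only the problem pools
zpool status rpool    # full view: vdev layout, read/write/checksum errors, last scrub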

Regarding storage - @UdoB do NVMe drives not count as SSDs?
Yes, for sure they do :)

Unfortunately, there seem to be only very few NVMe drives with PLP in the price range I am interested in. I haven't checked lately... but in my Homelab my NVMe drives do not have it.

You see... in the end you are forced to accept a compromise between "best practice" and "actually feasible for me".

I'm looking at either a 970 EVO Plus or a WD Red SA500 - is that good enough?
Check the data sheet and verify the presence of "Power Loss Protection".
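
PLP usually isn't reported by SMART, so it really is a data-sheet check; this just prints the exact model strings to look up (device names are examples):

Code:
lsblk -d -o NAME,MODEL,SIZE
smartctl -i /dev/nvme0n1     # model number and firmware of an NVMe drive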

Good luck :-)
 
I'm looking at either a 970 EVO Plus or a WD Red SA500 - is that good enough? It's a level of expense I could afford right now.
I'm still happy with my old 970 EVO Plus but it depends a lot on the workload. The SA500 has four times the durability (at twice the price) but is slower due to its SATA interface. Neither has PLP.
 
Yeah, I don't think I'll get a PLP drive anytime soon.

One more thing I can't figure out: should I have a more durable drive for boot, or for VMs? On one hand, all the logging and other hypervisor stuff probably does a ton of writes on the system drive. On the other hand, people usually recommend putting the root on a cheaper drive and keeping the good stuff for VMs. I guess it really depends on what the VMs are doing; there's probably no rule of thumb.

Like, I could spring (per node) for the smallest SA500 for boot, and then a pair of cheaper NVMe drives for a mirror to put VMs on.
  • Still no idea if and how I should set up ZFS for any of that.
  • For boot, I guess not at all? Or definitely yes, cause the cache-enabled drive can take it okay(ish)?
  • And for a mirrored pair of non-cache cheaper drives, no ZFS - they're redundant anyways?
  • But isn't mirroring SSDs a bad idea, because they both go through the same stuff and will likely fail around the same time anyway?
  • And ZFS pools... How do the multiple physical components in a pool even communicate? I assume the units that are in separate machines need a dedicated network, and a fast one at that (>1 Gbit). I don't have any of that - can I still benefit from distributed ZFS? I saw there's also ZFS over NFS - can I use the hard drives in my NAS to any benefit?
 
One more thing I can't figure out: should I have a more durable drive for boot, or for VMs? On one hand, all the logging and other hypervisor stuff probably does a ton of writes on the system drive. On the other hand, people usually recommend putting the root on a cheaper drive and keeping the good stuff for VMs. I guess it really depends on what the VMs are doing; there's probably no rule of thumb.
Yes, indeed, it depends. Proxmox works fine on an old HDD with slow reads and (lots of) writes (and no durability issues on a CMR HDD). Your VMs need a lot of (random) IOPS, as you are putting multiple systems' worth of I/O on the drives, but it depends on the workload of your VMs.
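
If you want to get a feel for what a given drive delivers under that kind of load, a quick and rough random-I/O test with fio (fio is in the Debian repos; the file path is a placeholder - point it at the datastore you want to measure and remove the test file afterwards):

Code:
fio --name=randrw --filename=/path/to/testfile --size=2G --direct=1 \
    --rw=randrw --rwmixread=70 --bs=4k --iodepth=32 --ioengine=libaio \
    --runtime=30 --time_based --group_reporting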
 
Since I had a relatively cheap way to get a whole 2nd box, I was thinking I don't need to get redundant storage in each of them. I realize that duplicating the whole node (and running cheaper consumer drives) means increasing the chance of some other component in the system failing. Still, it "feels" like I'm getting more redundancy for less money going this route vs. higher-grade redundant drives in a single node. Am I at least partially making sense?
Statistically, the disk will fail first. SSDs fail hard and are - at least in my experience - not recoverable. It's just e-waste that does not get recognized at all. If you can live with data loss, you can use only one. ZFS is the only filesystem that currently does incremental replication, so you may also need ZFS. Or you care even less about data loss and do manual replication.

Similarly, having a redundant node made me believe I don't even need to be running ZFS
Replication requires ZFS.


If I were in your shoes, I would go with mirrored ZFS in one system and shelve the other identical system you got cheap, so that you have online redundancy with your ZFS mirror (and offline backups, of course) and offline redundancy through the replacement hardware. You use less electricity, are statistically less likely to have data loss, and have a better time to recovery. I would also buy used enterprise SSDs, which are still good, and put my important stuff including the OS on them. I have this in every PVE box I have.
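
If you buy used, check what you actually received before trusting it - a rough sketch (the device name is just an example):

Code:
smartctl -i /dev/sda    # identity: model, firmware, capacity
smartctl -A /dev/sda    # attributes: power-on hours, wear level, total data written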
 
Statistically, the disk will fail first. SSDs fail hard and are - at least in my experience - not recoverable. It's just e-waste that does not get recognized at all.
Yeah I just experienced that last week.

ZFS is the only filesystem that currently does incremental replication, so you may also need ZFS. Or you care even less about data loss and do manual replication.
Ah, so it's as simple as that? If I want to do HA, that VM's drive will need to be ZFS? But that doesn't mean boot drive needs it as well, right?

I would also buy used enterprise SSDs, which are still good, and put my important stuff including the OS on them. I have this in every PVE box I have.
Any chance you could hook me up with some of that sweet stuff?

I'm still happy with my old 970 EVO Plus but it depends a lot on the workload. The SA500 has four times the durability (at twice the price) but is slower due to its SATA interface. Neither has PLP.
I'm seeing them at roughly the same price - are you sure you're not looking at a bigger SA500?

Another option I'm considering - unused 16 GB M10 Optane NVMe cards go for peanuts (like 5 EUR) on eBay/AliExpress. That should be plenty for a boot drive, right? Is there a good reason not to go that route? Then add a 1 TB SA500 (90-ish EUR) for VMs and go ZFS on both.
 
Ah, so it's as simple as that? If I want to do HA, that VM's drive will need to be ZFS? But that doesn't mean boot drive needs it as well, right?
I think so, but I prefer ZFS for the checksums against bitrot. Proxmox also writes a lot to ext4/LVM (but that saves you the ZFS write amplification and metadata sync-write overhead).
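
Those checksums only catch bitrot when the data is actually read, which is what a scrub is for - a minimal sketch (rpool is the installer default; Debian/PVE normally schedules a monthly scrub already):

Code:
zpool scrub rpool     # read everything in the pool and verify checksums
zpool status rpool    # shows scrub progress and any checksum errors found
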
I'm seeing them at roughly the same price - are you sure you're not looking at a bigger SA500?
Probably, sorry.
Another option I'm considering - unused 16 GB M10 Optane NVMe cards go for peanuts (like 5 EUR) on eBay/AliExpress. That should be plenty for a boot drive, right?
I install new Proxmox versions in a VM on a 14 GB virtual drive. That allows me to transfer the installation from the VM via a "16 GB" USB stick to the final SSD. I think it is fine, but store your ISOs, templates and swap space somewhere else.
 
Ah, so it's as simple as that? If I want to do HA, that VM's drive will need to be ZFS?
No. The actual requirement is to have "Shared storage". ZFS is not really shared but with replication it works! (And this is officially supported.) https://pve.proxmox.com/pve-docs/pve-admin-guide.html#_requirements_3

See https://pve.proxmox.com/pve-docs/pve-admin-guide.html#_storage_types - there are several options for the column "Shared"="yes".
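
For reference, a replication job is one line on the CLI (VM ID 100 and the target node name pve2 are placeholders; the Datacenter -> Replication GUI does the same):

Code:
# replicate VM 100 to node "pve2" every 15 minutes
pvesr create-local-job 100-0 pve2 --schedule '*/15'
pvesr status                         # list jobs and when they last synced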

But that doesn't mean boot drive needs it as well, right?
No.

Where the OS lives and where the datastores for the VM volumes reside are completely independent.
 
Update: I ordered the following:
5x 16 GB Optane NVMe
3x 400 GB Intel DC S3710 SATA

It was cheap (I think; I might have overpaid for the S3710s)... under 200€ for everything.

Now, picking up where I left off, the current idea is 2 identical nodes in a cluster (+ RPi qdevice), each with:
  • 2x 16 GB Optane, striped, for boot
  • 1x 400 GB SATA for VMs
One of each drive is kept as a spare; backups are sorted. All VMs are HA via ZFS replication. If any drive dies, failover to the 2nd node needs to happen. Not sure yet what exactly would need to happen next, but it should be possible to restore full cluster health relatively quickly, although a host would need to be reinstalled if any of the (4) 16 GB Optanes fails.

What I'm thinking now, reading up on ZFS - could I partition up the 400GB so there's also a 16GB chunk on it that I can include in a RAIDz together with the two 16GB drives? That way I'd have a boot pool that doesn't die on any one drive failure, if what I'm thinking is even possible.

Restoring the cluster would be easier, as I'd just need to swap out whatever drive died, and the node would heal either by resilvering the boot RAIDz or by HA-replicating the VM storage.

I would love to hear that it's a magnificent idea and nothing can go wrong with it!
 
What I'm thinking now, reading up on ZFS - could I partition up the 400GB so there's also a 16GB chunk on it that I can include in a RAIDz together with the two 16GB drives?
Yes, technically you can do that. (I don't know if the installer allows this.) But my very personal recommendation is: no, don't!

You would probably need to install on a single device and add the second Optane later on, via CLI.
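
For what it's worth, a rough sketch of that CLI step, assuming the default partition layout the installer creates (partition 2 = ESP, partition 3 = ZFS) and example device names - double-check yours with lsblk first:

Code:
# copy the partition table of the existing boot disk to the new one, then re-randomize GUIDs
sgdisk /dev/nvme0n1 -R /dev/nvme1n1
sgdisk -G /dev/nvme1n1
# attach the new ZFS partition to the existing single-disk vdev -> it becomes a mirror
zpool attach rpool /dev/nvme0n1p3 /dev/nvme1n1p3
# make the second disk bootable as well
proxmox-boot-tool format /dev/nvme1n1p2
proxmox-boot-tool init /dev/nvme1n1p2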

With 2x 16GB striped you have zero redundancy, which is a bad idea. But to attach a SATA partition to build a RaidZ sounds really creepy! Partitions are a bad idea, and SATA will slow those Optanes down to... SATA speed. Additionally, complexity is raised, as this construct is not a simple mirror.

If you really want to go down that road: just install it. It will work. Then actually test several failure situations and handle them, e.g. physically remove the one Optane you initially installed PVE on (claiming it is dead) and try to boot, for disaster recovery.

To me it always looks like a good idea to stick cleanly with well-established best practice - not everything that is possible is a good idea ;-)

That said... if it works for you then it is fine - and I do a lot of not-really-recommended things in my Homelab too :-)
 
But to attach a SATA partition to build a RaidZ sounds really creepy!
I know what you mean, haha! I will probably test this when the stuff arrives. I read that Proxmox allows installing on multiple devices, even differently sized ones. I'm getting really eager about ZFS, so I can't wait to start playing around with it. Btw, by far the best article for a beginner like me was this: https://arstechnica.com/information...01-understanding-zfs-storage-and-performance/

If 16GB was 100% sufficient for boot (actually, everything apart from VM/CT storage), I'd mirror the Optanes. But I got the feeling 16GB is just a bit too tight; @leesteken suggested I put swap somewhere else. TBH, my "local" datastore is currently at 4.41GB so I can't say I understand right now how 16GB would not be sufficient.
  • What makes the root partition grow so much? System logs are one thing; anything else? Swap - how much do I really need?
  • I'd much rather put the stuff that would get me uncomfortably close to 16GB somewhere else than stripe/concat the two Optanes. But where to put that stuff? Can I offload logs to my NAS? That's what I'd be doing with ISOs and the like.
  • If I end up mirroring the 16GB boot drives, how much available storage would I end up with if I use ZFS?
 
But I got the feeling 16GB is just a bit too tight
The installer minimum is 8GiB, so that should be enough (but indeed tight).
TBH, my "local" datastore is currently at 4.41GB so I can't say I understand right now how 16GB would not be sufficient.
I have some additional software installed (like vendor-reset, including sources from GitHub) and my Proxmox root (on ZFS) is less than 8GB.
  • What makes the root partition grow so much? System logs are one thing; anything else?
I limit the system log to 64MB in /etc/systemd/journald.conf.
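
For reference, that is a single setting plus a restart (64M is just the value I chose):

Code:
# /etc/systemd/journald.conf, in the [Journal] section
SystemMaxUse=64M

# apply the new limit
systemctl restart systemd-journald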
 
What makes the root partition grow so much?
I do not have that problem - that's why I can't say how much space is actually required.

Looking at "rpool/ROOT/pve-1" on the seven Mini-PCs in my Homelab: six of them are below 6 GB and only one has 8 GB occupied.

The problem is that some additional space is used temporarily. For example (if I remember correctly), when downloading templates, they are stored in some temporary location first and then moved to the final destination. Finding out which location that is and mounting it from some other storage should be feasible.

Sorry, I have no overview of these mechanisms, but several "optional" (read: not required for the boot process) directories can be moved to the other datastore. Another example: "/var/lib/vz/template/" may hold large container-templates and large ISOs. It can be mounted wherever you want - instead of occupying space in the rpool.
 
Another example: "/var/lib/vz/template/" may hold large container-templates and large ISOs. It can be mounted wherever you want - instead of occupying space in the rpool.
This is exactly what happened - "/var/lib/vz/dump" got filled up by a backup and the rpool ran out of space (100% used). I have now deleted the contents of that directory, but I need a solution (I need to be able to back up).

The configuration is as discussed above,
  • (rpool) 2x16GB Optane NVMe drives work nicely in RAID1
  • (zpool) 400GB SATA S3710 is hosting all the guests (as far as I could click my way through it)...
Two such nodes in a cluster, with an additional RPi qdevice. I set this up yesterday, tested migrations and all, tested failover by unplugging a node, etc... everything works. I set up PBS backups and scheduled them for 01:00 but forgot to test that :facepalm:

What (I think) I need to do is create a new ZFS dataset on zpool and mount it at "/var/lib/vz/dump", or even "/var/lib/vz". I also saw the option to set tmpdir and dumpdir in "/etc/vzdump.conf".
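
The vzdump.conf route would look something like this (the paths are just examples for directories on the bigger pool):

Code:
# /etc/vzdump.conf
dumpdir: /zpool/dump
tmpdir: /zpool/tmp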
 
Update: it turns out I misconfigured one of the backup jobs and pointed it at local storage instead of PBS. Regardless, I managed to move "/var/lib/vz" to the other ZFS pool:
  • I installed a bare-bones Cockpit + the 45Drives ZFS plugin on one node to make my life easier. I know it's not a good idea to manage the system that way when you're running a hypervisor, but it's a low-risk homelab environment.
  • I saw there that "/var/lib/vz" is already an rpool ZFS mount. I clicked my way through the GUI creating a new ZFS dataset on the big zpool, unmounted the rpool one (remounting it at a default "/rpool/var-lib-vz"), then mounted the new one at the right location.
  • Then, after testing that backups work now, I re-did the same thing via the CLI on the other node:
    Code:
    # new dataset on the big pool to hold dumps, templates and ISOs
    zfs create zpool/var-lib-vz
    # move the original rpool dataset out of the way
    zfs set mountpoint=/rpool/var-lib-vz rpool/var-lib-vz
    # mount the new dataset where PVE expects the "local" directory storage
    zfs set mountpoint=/var/lib/vz zpool/var-lib-vz

Tested everything; it works. Now I'll just clean up - remove the rpool datasets on both nodes, uninstall Cockpit, and wait for the next bomb to explode :)
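
The cleanup itself should be little more than this on each node (assuming nothing is mounted from the old dataset any more; the Cockpit plugin package name may differ on your install):

Code:
zfs destroy rpool/var-lib-vz    # drop the now-unused rpool dataset
apt purge cockpit               # remove Cockpit (plus whatever ZFS plugin was added)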
 