Separating boot drive from VM/LXC drive + SSD/ZFS questions

jaytee129

Thinking of running the Proxmox OS on separate "budget" drives (two mirrored using ZFS RAID 1) and using high-end 2TB NVMe PCIe 4.0 M.2 SSDs for all my VMs/LXCs (two mirrored with ZFS RAID 1). The boot drives would also be the place for 'local' files, e.g. ISOs and backups, which I would also copy to an external NAS folder.

My understanding is that this helps in the following ways: it keeps all my high-end drive space for VMs, and if the boot drives fail or get corrupted I just need to restore a relatively small Veeam bare-metal backup (stored somewhere other than the boot drives). If the VM drives fail, I don't have to worry about the OS; I just restore the VMs from the backups I keep on the 'local' filesystem on the boot drives. As I'm relatively new to Proxmox and haven't lived through a major failure or restore yet, I'd like to know whether this would work and have the benefits I think it would.

I'm also reading that ZFS is hard on SSDs, and certainly harder than motherboard RAID (the other option I'm considering) would be. I've also read that ZFS may not support TRIM under Proxmox (mixed messages on that). On the other hand, ZFS is said to be better than hardware RAID for several reasons, most notably that recovering from hardware RAID problems can be complicated by proprietary implementations and even firmware differences in replacement hardware.

Is ZFS hard on SSDs? If so, would using cheap SSDs as boot drives cost me more in the long run, i.e. would I have to replace them every couple of years? I haven't decided yet whether the budget drives will be HDDs or SSDs.

Does TRIM work with the version of ZFS that ships with Proxmox?

Lastly, my perception is that there isn't much OS boot-drive activity, so performance isn't that important, hence the proposed use of budget drives. Is that true, or will a "slow" Proxmox OS drive create bottlenecks that make putting the VMs on high-end drives a waste of money? Should I use SSDs for the boot drives, or will HDDs be good enough?
 
In the past, I've seen better performance by separating the ZFS pools and not storing VMs on the rpool. Probably because of the different read/write patterns of Proxmox and VMs. I don't know if this is still the case.

Proxmox itself (on ZFS) should not be run from a USB stick or SD card (the many small writes will wear them out), but it does not need high performance. Those same small writes will also wear out consumer SSDs over time. It runs fine from an HDD, as long as you don't run VMs on the same ZFS pool.
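If you go that route, creating the separate mirrored guest pool and registering it with Proxmox is quick. A minimal sketch, assuming a pool name of vmpool and placeholder by-id paths standing in for your actual NVMe drives:

```
# Create a mirrored pool for VM/LXC storage, separate from the boot pool.
# ashift=12 assumes 4K physical sectors; replace the by-id paths with your drives.
zpool create -o ashift=12 vmpool mirror \
  /dev/disk/by-id/nvme-DRIVE_A /dev/disk/by-id/nvme-DRIVE_B

# Register it with Proxmox as ZFS storage for VM disks and container root filesystems:
pvesm add zfspool vmdata --pool vmpool --content images,rootdir
```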

Synchronous writes (which make the process wait until the data is actually written, according to the drive) will slow down performance, unless your devices have a battery, like enterprise RAID controllers and enterprise SSDs. Consumer SSDs will have terrible write amplification on every little synchronous write, because they have no other way to guarantee that it is stored in the flash memory. If they had a battery (or sufficient capacitors) they could combine lots of little writes and wear out the flash less. HDDs don't have this problem, but they have issues with lots of reads and writes (IOPS), as they cannot do as many as quickly as SSDs can.
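If you want to see this effect on your own hardware, fio can demonstrate it. A sketch, assuming a scratch file on the drive under test (the /tank/testfile path is a placeholder; results vary a lot per drive):

```
# 4K random writes the drive may cache freely (async-like behaviour):
fio --name=async --filename=/tank/testfile --rw=randwrite --bs=4k \
    --size=1G --ioengine=libaio --iodepth=16 --direct=1

# The same workload with an fsync after every write (sync-like behaviour);
# consumer SSDs typically collapse here, drives with PLP much less so:
fio --name=sync --filename=/tank/testfile --rw=randwrite --bs=4k \
    --size=1G --ioengine=libaio --iodepth=1 --direct=1 --fsync=1
```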

Motherboard RAID is usually terrible in performance, does not give good data safety, and cannot be read by other systems (when the motherboard breaks).

Proxmox supports TRIM on many levels, with and without ZFS.
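For reference, the usual knobs, assuming a pool named rpool (a sketch, not a recommendation for your particular setup):

```
# ZFS: trim continuously as blocks are freed...
zpool set autotrim=on rpool

# ...or trim on demand (or from cron):
zpool trim rpool
zpool status -t rpool    # shows per-vdev trim support and progress

# Non-ZFS storage (ext4, LVM-thin, ...): trim all mounted filesystems
fstrim -av
```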

If you're lucky, @Dunuin will explain all of this to you much better than I can.
 
Thanks @leesteken. A couple of follow-up questions, if I may.

RE: Proxmox host drive. Let's say I use some older 80GB HDDs I have from the days when that was a common size. Given your point about HDDs having "issues with lots of reads and writes (IOPS)" yet Proxmox running "fine from HDD", I'm inferring there are NOT lots of IOPS in the case of the Proxmox OS. Is that correct? If not, is it worth adding a ZFS SLOG using a 16GB Optane Memory module (~$50), which, if I understand correctly, is built for this purpose?

RE: VM drives on consumer SSDs. I will have the host on a UPS, so I was thinking of enabling write caching. Would the system then behave as though there was a controller with a backup battery, combining many writes before committing them to the drive to minimize the actual writes to the SSD?

Lastly, how will ZFS's RAM grab work? Will the two pools act like one ZFS instance or two, each trying to grab as much RAM as possible for itself? This might not be a big issue, as I plan to have a minimum of 24-32GB RAM left after allocating it for other stuff.
 
RE: Proxmox host drive. Let's say I use some older 80GB HDDs I have from the days when that was a common size. Given your point about HDDs having "issues with lots of reads and writes (IOPS)" yet Proxmox running "fine from HDD", I'm inferring there are NOT lots of IOPS in the case of the Proxmox OS. Is that correct? Is it worth adding a ZFS SLOG using a 16GB Optane Memory module (~$50), which, if I understand correctly, is built for this purpose?
Proxmox is just one operating system's worth of IOPS, which come mostly from its insistent logging. An SLOG only helps with sync writes, and I don't think Proxmox does many sync writes.
RE: VM drives on consumer SSDs. I will have the host on a UPS, so I was thinking of enabling write caching. Would the system then behave as though there was a controller with a backup battery, combining many writes before committing them to the drive to minimize the actual writes to the SSD?
I guess so, if you use unsafe write caching; otherwise sync writes would still need to be written as soon as possible (because VMs are waiting on them). But many VMs will generate many IOPS and writes all over the place, and I don't think they can all be combined. This is where an SLOG (instead of unsafe caching) might speed up the sync writes, but you would want more redundancy than just one Optane drive.
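To make that concrete: a SLOG can be added to (and removed from) an existing pool at any time, and mirroring it covers the redundancy concern. A sketch, assuming a pool named vmpool and placeholder by-id paths:

```
# Add a mirrored SLOG to an existing pool:
zpool add vmpool log mirror \
  /dev/disk/by-id/nvme-OPTANE_A /dev/disk/by-id/nvme-OPTANE_B

# Log vdevs can be removed again if they don't help;
# the "mirror-N" name is the one shown by `zpool status vmpool`:
zpool remove vmpool mirror-1
```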
Lastly, how will ZFS's RAM grab work? Will the two pools act like one ZFS instance or two, each trying to grab as much RAM as possible for itself? This might not be a big issue, as I plan to have a minimum of 24-32GB RAM left after allocating it for other stuff.
It's all one ARC and you can set the limits yourself.
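For example, capping the ARC persistently is a two-liner. A minimal sketch, with 8 GiB chosen arbitrarily:

```
# Limit the ARC to 8 GiB across all pools (value in bytes):
echo "options zfs zfs_arc_max=8589934592" >> /etc/modprobe.d/zfs.conf
update-initramfs -u   # pick up the change when booting via initramfs

# Or apply immediately, without a reboot:
echo 8589934592 > /sys/module/zfs/parameters/zfs_arc_max
```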
I don't know what kind of I/O load your VMs will produce, and I'm not an expert on these topics. There is no substitute for testing and tuning the system against your actual workload.
 
I guess so, if you use unsafe write caching; otherwise sync writes would still need to be written as soon as possible (because VMs are waiting on them). But many VMs will generate many IOPS and writes all over the place, and I don't think they can all be combined. This is where an SLOG (instead of unsafe caching) might speed up the sync writes, but you would want more redundancy than just one Optane drive.
Yup, only with unsafe caching will sync writes be handled as async writes and cached. But keep in mind that any hardware problem or kernel crash does something similar to a power outage, so that could still corrupt your data, even with a UPS, when sync writes are disabled.
And with normal sync writes, without unsafe caching, using a UPS won't make your SSDs use caching to speed up sync writes or reduce write amplification. That's because the SSD's firmware can't know that there is a UPS, so it won't try to cache sync writes in the SSD's internal RAM cache. For that, you need enterprise SSDs with built-in power-loss protection: the SSD has its own integrated "backup battery", so it knows that it's OK to use caching.
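For completeness, the "unsafe caching" discussed here maps to the ZFS sync property. A sketch, assuming the rpool/data dataset that a default Proxmox install creates for guests:

```
# Check the current sync policy:
zfs get sync rpool/data

# Treat sync writes as async ("unsafe caching"; you can lose the last
# few seconds of writes on a crash or power loss, even with a UPS):
zfs set sync=disabled rpool/data

# Revert to the default behaviour:
zfs set sync=standard rpool/data
```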
 
Thanks for the great info @leesteken and @Dunuin. A question that remains: will having the Proxmox OS on an HDD without any write caching have any impact on the performance of my VMs running on separate, very high-speed NVMe SSDs? For the latter I'm looking at the Samsung 980 PRO, which, from what I've read so far, should be able to handle lots of synchronous writes and last, thanks to its "2GB Low Power DDR4 SDRAM (2TB)" cache.
 
A question that remains: will having the Proxmox OS on an HDD without any write caching have any impact on the performance of my VMs running on separate, very high-speed NVMe SSDs?
In the past, I've seen better performance by separating the ZFS pools and not storing VMs on the rpool.
I think that's about as separated as you'll get.
For the latter I'm looking at the Samsung 980 PRO, which, from what I've read so far, should be able to handle lots of synchronous writes and last, thanks to its "2GB Low Power DDR4 SDRAM (2TB)" cache.
Synchronous writes (which make the process wait until the data is actually written, according to the drive) will slow down performance, unless your devices have a battery, like enterprise RAID controllers and enterprise SSDs. Consumer SSDs will have terrible write amplification on every little synchronous write, because they have no other way to guarantee that it is stored in the flash memory.
And with normal sync writes, without unsafe caching, using a UPS won't make your SSDs use caching to speed up sync writes or reduce write amplification. That's because the SSD's firmware can't know that there is a UPS, so it won't try to cache sync writes in the SSD's internal RAM cache. For that, you need enterprise SSDs with built-in power-loss protection: the SSD has its own integrated "backup battery", so it knows that it's OK to use caching.
I don't think the 980 Pro has a battery/capacitors for power-loss protection (PLP). Feel free to ignore the advice about using enterprise SSDs with PLP at your own peril.
 
jaytee129 said:
A question that remains: will having the Proxmox OS on an HDD without any write caching have any impact on the performance of my VMs running on separate, very high-speed NVMe SSDs?
leesteken said:
In the past, I've seen better performance by separating the ZFS pools and not storing VMs on the rpool

Dunuin said:
I think that's about as separated as you'll get.

Yes, I read that, but it doesn't answer the question I asked. Maybe asking the question differently will help.

Would there be a noticeable and meaningful difference in the performance of VMs running on a separate high-speed NVMe SSD between these two setups?

1) Proxmox OS running on a SATA SSD with read cache only

2) Proxmox OS running on a SATA HDD with read cache only
 
Sorry, let me change the question to:

Would there be a noticeable and meaningful difference in the performance of VMs running on a separate high-speed NVMe SSD between these two setups?

1) Proxmox OS running on an NVMe SSD with read cache only, with the same speed characteristics as the VM SSD

2) Proxmox OS running on a SATA HDD with read cache only
 
If Proxmox is on another ZFS pool, it will have the least amount of impact on the VMs, regardless of HDD or SSD. I'm sorry if I don't understand what you are asking for. I'll stop here, and hopefully someone else will answer your actual question.
 
For the latter I'm looking at the Samsung 980 PRO, which, from what I've read so far, should be able to handle lots of synchronous writes and last, thanks to its "2GB Low Power DDR4 SDRAM (2TB)" cache.
As already said, you need an enterprise SSD with power-loss protection to be able to make use of the RAM cache when using sync writes. The Samsung 980 Pro is just a consumer SSD, so it won't be able to use that 2GB RAM cache for sync writes, and performance won't be great.
A question that remains: will having the Proxmox OS on an HDD without any write caching have any impact on the performance of my VMs running on separate, very high-speed NVMe SSDs?
It shouldn't as long as you really only use the HDD for the OS.
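If in doubt, it's easy to watch whether the OS disk actually sees meaningful I/O while the VMs are busy. A sketch (iostat comes from the sysstat package):

```
# Per-pool and per-vdev I/O, refreshed every 5 seconds:
zpool iostat -v 5

# Per-device throughput, IOPS and utilisation:
iostat -x 5
```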
 
jaytee129 said:
Thinking of running the Proxmox OS on separate "budget" drives (two mirrored using ZFS RAID 1) and using high-end 2TB NVMe PCIe 4.0 M.2 SSDs for all my VMs/LXCs (two mirrored with ZFS RAID 1). [...] Should I use SSDs for the boot drives, or will HDDs be good enough?
What did you end up doing? I have the exact (!) same question.
 
I just put everything on the 2TB enterprise SSDs (mirrored using ZFS RAID 1).

I did learn that TRIM can either run automatically (when there is a lot of trimming to do) or as a periodic command/cron job. I went with a weekly cron job in the middle of the night, to avoid a trim kicking in in the middle of the day as it might with the automatic option.
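Something like this, as a sketch (assuming the pool is named rpool and a Sunday 03:00 schedule):

```
# /etc/cron.d/zpool-trim -- weekly TRIM of the pool, Sundays at 03:00
0 3 * * 0 root /usr/sbin/zpool trim rpool
```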
 
I just put everything on the 2TB enterprise SSDs (mirrored using ZFS RAID 1).

I did learn that TRIM can either run automatically (when there is a lot of trimming to do) or as a periodic command/cron job. I went with a weekly cron job in the middle of the night, to avoid a trim kicking in in the middle of the day as it might with the automatic option.
Can you share that cron job (settings) or the source where you found it?
 
