[SOLVED] Advice on configuration for Proxmox VE

Hello everyone,

I recently got a dedicated server and installed Proxmox VE 7.4-17 on it. (I am planning to upgrade to Proxmox VE 8.1 ASAP, but there is an issue on the hoster's side that makes this impossible at the moment.)
I am currently trying out a couple of possible configurations for production/deployment and would like some thoughts/recommendations on them.

I am planning on hosting a couple of VMs with Debian 12/11 and Debian 12 + DirectAdmin.
All the VMs are for my personal use, so I am not concerned about hitting 100% uptime, but reliability and uptime are still important to me.

The first possible configuration is:
- Proxmox VE on the 1TB SSD.
- VM disks and Cloud-Init configs on one 2TB SSD.
- Local backups of the VMs on one 2TB SSD.

The second possible configuration is:
- Proxmox VE on the 1TB SSD.
- Using ZFS for both 2TB SSDs and putting them in software RAID 10 or mirror.

The third possible configuration is:
- Same as the second except using the 1TB SSD for both Proxmox VE and backups.


The upsides of the first configuration seem to be:
- The VMs have their own dedicated disk, which reduces the impact on the host/Proxmox VE when there are a lot of simultaneous reads/writes.
- There is one disk dedicated to backups, so the boot disk and/or the VM disk can fail without losing VM data.

The downsides of the first configuration seem to be:
- The VM disks are not in software RAID 10/mirror, so if the VM disk fails it creates a lot of downtime. (The VM disk needs to be replaced before the VMs can be restored.)

The upsides of the second configuration seem to be:
- If one of the two VM disks fails, the VMs can keep running on the remaining disk, since the disks are redundant thanks to the software RAID 10/mirror.

The downsides of the second configuration seem to be:
- The configuration does not leave a dedicated disk for backups.

The upsides of the third configuration seem to be:
- The disks for the VMs are redundant.
- There is space for local backups.

The downsides of the third configuration seem to be:
- Proxmox VE and the backups have to share a disk, which increases the reads/writes and as a result wears out that SSD quicker. (And since, as far as I know, the VM settings/configs are only stored in Proxmox VE and in the backups, it becomes a real problem if that SSD fails.)
- When backups are running, they might slow down the host/Proxmox VE, since they share the same SSD.

My server configuration:

- 1x Intel Xeon E5 2630v4 (10 cores/20 threads)
- 1x 256 GB REG ECC
- 1x 1 TB SSD (No RAID) - Boot disk (Crucial MX500 1TB 3D NAND SATA 2.5-inch/CT1000MX500SSD1)
- 2 x 2 TB SSD (No RAID) (Crucial MX500 2TB 3D NAND SATA 2.5-inch/CT2000MX500SSD1)

Any advice and/or suggestions are appreciated.
 
- 1x 1 TB SSD (No RAID) - Boot disk (Crucial MX500 1TB 3D NAND SATA 2.5-inch/CT1000MX500SSD1)
- 2 x 2 TB SSD (No RAID) (Crucial MX500 2TB 3D NAND SATA 2.5-inch/CT2000MX500SSD1)
You will not have fun with ZFS on those consumer disks.


With 3 disks like in your setup, there is no point in running RAID. If the non-RAID disk fails, you will not have a working system anymore, and without backups you will be unable to get everything up and running again. So run RAID for everything (e.g. PVE + VMs, and use the remaining disk for backups and ISO/container templates) or have proper external backups.
 
You will not have fun with ZFS on those consumer disks.

I would just run it with noatime; at 2 TB the TBW is not bad for a "consumer" disk.
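Roughly like this, assuming the VM pool ends up being called "tank" (adjust to whatever pool name you actually create):

# disable access-time updates on the pool to save pointless metadata writes
zfs set atime=off tank
# verify the setting
zfs get atime tank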

With 3 disks like in your setup, there is no point in running RAID. If the non-RAID disk fails, you will not have a working system anymore, and without backups you will be unable to get everything up and running again. So run RAID for everything (e.g. PVE + VMs, and use the remaining disk for backups and ISO/container templates) or have proper external backups.

This one is a no-brainer, actually. Just not sure what OP meant by "RAID 10" on the 2 disks.
 
Thanks for the advice so far.


This one is a no-brainer, actually. Just not sure what OP meant by "RAID 10" on the 2 disks.
What I mean by that is configuring the two disks via the GUI as ZFS and selecting RAID 10. (I would assume this effectively becomes RAID 1, since there are just two disks of the same size.)
In that same menu I also saw the option to mirror both disks, but I have no idea what the advantages/disadvantages would be of using mirror instead of RAID 10 for ZFS.


With 3 disks like in your setup, there is no point in running RAID. If the non-RAID disk fails, you will not have a working system anymore, and without backups you will be unable to get everything up and running again. So run RAID for everything (e.g. PVE + VMs, and use the remaining disk for backups and ISO/container templates) or have proper external backups.
My VMs with DirectAdmin also have backups, since I also back up DirectAdmin itself and upload it to external S3 storage.
My templates allow for quick reinstalls in case of VM data loss, but I feel like prevention is better than reacting when there is data loss.
You mentioned that it would be better to have Proxmox VE on (software) RAID plus a dedicated backup disk.
So would it be better, then, to use software RAID 10 on the two 2TB SSDs for Proxmox VE and have Proxmox VE and the VMs share the same "disk"? (Combining these two will also increase the workload on the SSDs and possibly wear them out much quicker.)
 
What I mean by that is configuring the two disks via the GUI as ZFS and selecting RAID 10. (I would assume this effectively becomes RAID 1, since there are just two disks of the same size.)
In that same menu I also saw the option to mirror both disks, but I have no idea what the advantages/disadvantages would be of using mirror instead of RAID 10 for ZFS.

Maybe I do not use the GUI enough, but I am not sure about this. Anyhow, if it basically means mirroring (the term ZFS uses anyway), that is all clear now.

My VMs with DirectAdmin also have backups, since I also back up DirectAdmin itself and upload it to external S3 storage.
My templates allow for quick reinstalls in case of VM data loss, but I feel like prevention is better than reacting when there is data loss.
You mentioned that it would be better to have Proxmox VE on (software) RAID plus a dedicated backup disk.
So would it be better, then, to use software RAID 10 on the two 2TB SSDs for Proxmox VE and have Proxmox VE and the VMs share the same "disk"? (Combining these two will also increase the workload on the SSDs and possibly wear them out much quicker.)

I can't speak for @LnxBil, but the way I (mis?)read it was: since you had the scenario of PVE on a single (single-point-of-failure) drive (your intention being to keep the writes separate, which is not really an issue, certainly not on NVMe) AND backups on that same drive, what is the point of having redundancy only for the VMs? The cardinal rule is that RAID is not a form of backup (ask anyone with a failed hardware controller or just a bad day on the command line). So it is no use having the underlying storage for the VMs redundant when you do not provide the same for the hypervisor. I think having backups (understanding the limitations of them being local) on the 1TB is fine.
 
And I would say this is alright as well, just the 1TB is overkill for that.
It definitely is, but it is the minimum my hoster offers, and it also allows me to have the templates on the same disk as Proxmox VE.
I am currently experimenting with creating the templates on the fly. (A script downloads the cloud images, configures the VMs and turns them into templates.)
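As a rough sketch (the VMID 9000, the storage name local-zfs and the Debian image URL are just the examples I am playing with), the script boils down to something like:

# download a cloud image, import it as a VM disk, add a cloud-init drive and mark the VM as a template
wget https://cloud.debian.org/images/cloud/bookworm/latest/debian-12-genericcloud-amd64.qcow2
qm create 9000 --name debian12-tmpl --memory 2048 --cores 2 --net0 virtio,bridge=vmbr0
qm importdisk 9000 debian-12-genericcloud-amd64.qcow2 local-zfs
qm set 9000 --scsihw virtio-scsi-pci --scsi0 local-zfs:vm-9000-disk-0
qm set 9000 --ide2 local-zfs:cloudinit --boot order=scsi0 --serial0 socket --vga serial0
qm template 9000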

My only concern, as mentioned by @LnxBil, is that in all configurations Proxmox VE is not on a redundant disk. (And this might be a problem, since Proxmox VE writes a fair amount of data over the day and thus wears the SSD out quicker.)
I do have the option to possibly add a second 1TB SSD and allowing Proxmox VE to be on a redundant disk.
This does make it more expensive, however (25 euro/month extra), and I feel like, for just Proxmox VE and templates, it should not wear the disks out quickly enough to really need to plan for it. (If the disk fails, Proxmox VE would stop working, but since I also have local backups to restore from, it shouldn't be a big issue?)
But I would really like some advice on this.
 
My only concern, as mentioned by @LnxBil, is that in all configurations Proxmox VE is not on a redundant disk. (And this might be a problem, since Proxmox VE writes a fair amount of data over the day and thus wears the SSD out quicker.)

What makes you think that PVE writes some exorbitant amount of data onto its own storage? And whether it's mirrored or not, it's the same amount (per disk) it will be writing.
 
What makes you think that PVE writes some exorbitant amount of data onto its own storage? And whether it's mirrored or not, it's the same amount (per disk) it will be writing.
I wouldn't say exorbitant, but given that it is apparently normal for Proxmox VE to write around 30 GB/day when idle (as I have read on the forum, which is also the reason consumer-grade disks are not recommended), and that Proxmox may end up on a single SSD with no redundancy, I would assume it is best to minimize the wear on that SSD.

I also feel like, given the TBW of 700 TB, it is more likely that the SSD fails for no reason rather than failing due to being worn out.
And given that the SSDs are provided by my hoster, and that they end up eating the cost if the disks wear out quickly or die, I don't know where the line would be between randomly failing and failing due to being worn out.
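For what it is worth, I plan to keep an eye on the wear via SMART (the attribute names differ per vendor; on these MX500s, Percent_Lifetime_Remain and Total_LBAs_Written appear to be the relevant ones):

# quick look at remaining endurance and total writes on one of the disks
smartctl -a /dev/sda | grep -i -E 'lifetime|lbas_written|wear'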
 
I wouldn't say exorbitant, but given that it is apparently normal for Proxmox VE to write around 30 GB/day when idle (as I have read on the forum, which is also the reason consumer-grade disks are not recommended), and that Proxmox may end up on a single SSD with no redundancy, I would assume it is best to minimize the wear on that SSD.
There is also the problem of write amplification, which is common with non-enterprise SSDs.


This one is a no-brainer, actually. Just not sure what OP meant by "RAID 10" on the 2 disks.
For RAID10, you need 4 disks, as you already wrote.


In that same menu I also saw the option to mirror both disks, but I have no idea what the advantages/disadvantages would be of using mirror instead of RAID 10 for ZFS.
RAID10 is not directly possible with ZFS; a striped mirror, however, is. The difference is the number of vdevs. A pool consists of at least one vdev, and if you have more than one vdev, the vdevs are concatenated (striped). The redundancy is set per vdev, so you can have a single disk or a RAIDz3 of x disks in the same pool.
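On the command line the difference looks roughly like this (pool and device names are only placeholders; in practice use /dev/disk/by-id paths):

# one mirror vdev of two disks - what your two 2TB SSDs would give you
zpool create tank mirror /dev/sdb /dev/sdc
# a "RAID10"-like striped mirror needs two mirror vdevs, i.e. four disks
zpool create tank mirror /dev/sdb /dev/sdc mirror /dev/sdd /dev/sde
zpool status tank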
 
A quick update and more of a final update for anyone who is interested.

I ended up changing the server configuration a little bit.
This is due to the boot disk not being redundant and thus having the potential for a lot of downtime.

My final configuration ended up being:
- 1x Intel Xeon E5 2630v4 (10 cores/20 threads)
- 1x 256 GB REG ECC
- 2x 1 TB SSD (No RAID) - Boot disk (Crucial MX500 1TB 3D NAND SATA 2.5-inch/CT1000MX500SSD1)
- 2 x 2 TB SSD (No RAID) (Crucial MX500 2TB 3D NAND SATA 2.5-inch/CT2000MX500SSD1)


I used the 2x 1TB SSDs for the boot disk and put them in software RAID 1 (in reality ZFS RAID 1).
The boot disk contains the OS, the templates and the installer files for the VMs/containers.
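Checking the mirror afterwards is just (rpool being the pool name the Proxmox VE installer uses by default for ZFS):

zpool status rpool
zpool list rpool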

I used one of the 2TB SSDs for the VM disks and VM data, and the other 2TB SSD for backups. (The backup strategy is to create backups locally and to S3 storage, with the S3 backups in two geographically separated locations, for a complete 3-2-1 backup plan.)

And since the disks for the VMs and the backups are identical, I can always swap their roles if one of the two disks fails. (The only downside is that the backups would then have to move to the boot disk, but since this is only for emergencies, I would not worry about it too much.)
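The local part of the backups is just a plain vzdump job to the backup disk, along the lines of (the VMIDs and the storage name backup2tb are only examples):

vzdump 100 101 --storage backup2tb --mode snapshot --compress zstd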

And lastly, thanks to everyone who helped me come to this configuration.
 
I used one of the 2TB SSDs for the VM disks and VM data, and the other 2TB SSD for backups. (The backup strategy is to create backups locally and to S3 storage, with the S3 backups in two geographically separated locations, for a complete 3-2-1 backup plan.)

And since the disks for the VMs and the backups are identical, I can always swap their roles if one of the two disks fails. (The only downside is that the backups would then have to move to the boot disk, but since this is only for emergencies, I would not worry about it too much.)
Sounds very complicated, whereas the KISS approach would be mirroring everything. I don't get people who buy server-grade hardware (at least CPU and RAM), put in cheap non-enterprise SSDs and then do no mirroring at all of the thing that has the highest probability of breaking first.
 
Sounds very complicated, whereas the KISS approach would be mirroring everything. I don't get people who buy server-grade hardware (at least CPU and RAM), put in cheap non-enterprise SSDs and then do no mirroring at all of the thing that has the highest probability of breaking first.
I get what you are saying about the "consumer"-grade SSDs (even though the TBW is on the high end for consumer SSDs and on the low end for enterprise SSDs), but this is unfortunately something I have no input on.
I order the storage size and the number of disks that I want, and my hosting provider then decides what the actual disks are going to be.
And given that they are the ones who end up eating the cost of the SSDs if they wear out quickly, and that they also run Proxmox VE for their own VMs, I am not too worried about it. (As long as I make them redundant, for obvious reasons.)

And for anyone who is interested, my hoster is Contabo.
 
I get what you are saying about the "consumer"-grade SSDs (even though the TBW is on the high end for consumer SSDs and on the low end for enterprise SSDs), but this is unfortunately something I have no input on.

I find this kind of unsolicited, repeated input you got good only for a bad aftertaste (no hard feelings, @LnxBil). You asked for advice on how to run what you have, and you explained yourself that you are not choosing the hardware, yet not once but twice you get the snide remark, as if these SSDs from Micron, which actually do have DRAM and PLP-like behaviour and a declared 0.7 PBW, were some kind of China factory leftovers.

I do not get this on this forum sometimes. We all have our preferences, but why impose them on others for no good reason? We do not know how much the SSDs are going to be shredded or not (you can monitor that over time and alert your hoster), because we do not even know the workload. For very serious redundancy concerns, people run clusters. And the whole point of a solution like PVE (cluster or not) is that you can run it on pretty much anything, even if the disks burn out like incandescent light bulbs, and just keep swapping parts and keep going. There are people out there putting Fury Renegades into PVE nodes with no mirrors, believing the 2 PBW declared on something that costs less than a DC600M from the same manufacturer. I get the criticism there, but here you did well, @Daniel_Dog, in my opinion.
 
