Are two PVE servers a bad idea?

chudak

Hello all

I have a home lab and have been running PVE for several years now with several VMs/CTs.
It's on an Intel NUC i7 with 64 GB RAM.

I am thinking of adding one more similar server for more reliability, mostly to protect against failed upgrades, to migrate VMs and such.

I hear that 2 servers are not a good idea.

Can people who run more than one PVE node share feedback and maybe give some advice on this?

TIA
 
A cluster needs at least 3 hosts: either 3x PVE nodes or 2x PVE nodes + 1x QDevice (some other machine in your LAN that is also running 24/7 and acts as a third voter... a VM on a NAS, an SBC, a thin client or whatever).
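If you go the QDevice route, the setup is roughly this (a minimal sketch; 192.168.1.50 is just a placeholder for your third machine, which needs to be reachable via SSH as root during setup):

# on the external voter (NAS VM, SBC, thin client, ...):
apt install corosync-qnetd

# on every PVE node of the 2-node cluster:
apt install corosync-qdevice

# on one PVE node, register the voter:
pvecm qdevice setup 192.168.1.50

# afterwards, check that the cluster now has 3 votes:
pvecm status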
If you don't want 3 hosts, you could run two unclustered individual nodes and, for example, run a PBS LXC on both of them and set up sync jobs so the two PBS instances sync with each other. If one PVE node then fails, you at least have recent backups you can restore to the remaining node, so you can very quickly get your most important services working again.
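A rough sketch of the PBS-to-PBS sync part (the names "pbs-node2", "backup" and "sync@pbs" are just examples):

# on PBS instance A, point a remote at PBS instance B:
proxmox-backup-manager remote create pbs-node2 \
    --host 192.168.1.21 \
    --auth-id sync@pbs \
    --password 'SECRET' \
    --fingerprint <fingerprint-of-B>

# pull B's datastore into A's local datastore on a schedule:
proxmox-backup-manager sync-job create sync-from-node2 \
    --remote pbs-node2 \
    --remote-store backup \
    --store backup \
    --schedule daily

# set up the mirrored jobs on instance B the same way.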
 

It depends on what your options are, money-wise, hardware-wise, etc.
You could save the VM/CT backups both on your PVE host AND on a NAS (or an external USB disk).
That's not totally foolproof, but it is a start.
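For the NAS part, that can be as simple as adding the share as a backup storage (a sketch; the storage name, IP and export path are made-up examples):

# add an NFS share on the NAS as a backup target:
pvesm add nfs nas-backup \
    --server 192.168.1.10 \
    --export /volume1/proxmox-backups \
    --content backup

# then point a backup job (Datacenter -> Backup) at "nas-backup".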

Personally I'm not a big fan of clustering.
Especially in a simple home lab, with no critical need for 24/7, 99% uptime.
Well, I don't need that anyway.
I use 3 separate PVE hosts, where one of them is my "test" host (it runs the beta/non-stable Proxmox).
On my NAS I run Proxmox Backup Server (as a VM).
My backup routine is 3 daily, 1 weekly and 1 monthly backup of the "important" VMs/CTs.
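That kind of retention can also be expressed directly in a vzdump job, e.g. something like this (a sketch, not my actual job; the VMIDs and storage name are placeholders):

# keep 3 daily, 1 weekly and 1 monthly backup per guest:
vzdump 100 101 102 \
    --storage nas-backup \
    --mode snapshot \
    --compress zstd \
    --prune-backups keep-daily=3,keep-weekly=1,keep-monthly=1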
 

I do backups as well. I am a bit paranoid about the safety of my Proxmox setup, as I do almost all my business/personal things on it.

I wish we had something like what I was using here: https://forum.proxmox.com/threads/a-la-pfsense-boot-environments.129868/
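Not a real boot-environment feature, but I guess if the host is installed on ZFS, a poor man's version would be snapshotting the root dataset before an upgrade (just a sketch; rpool/ROOT/pve-1 should be the default dataset name on a ZFS install, adjust if yours differs):

# before the upgrade:
zfs snapshot rpool/ROOT/pve-1@pre-upgrade

# if it goes wrong, roll back (from a rescue/live environment if the host no longer boots):
zfs rollback rpool/ROOT/pve-1@pre-upgrade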

I also like your point about simply having two independent PVE servers instead of a cluster.

(Thinking aloud - is it easy to copy a VM/CT from server 1 to server 2?)
 
(Thinking aloud - is it easy to copy a VM/CT from server 1 to server 2?)
PVE 7.4 added cross-cluster migration as a feature preview. With that (CLI only, using the "qm remote-migrate" command) you can migrate VMs between unclustered PVE hosts. Or you just do a backup on host A + a restore on host B.
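Roughly like this (a sketch; the VMID, host, token and storage/bridge names are placeholders, and the API token has to exist on the target side first):

# migrate VM 100 from this node to an unclustered target node:
qm remote-migrate 100 100 \
    'host=192.168.1.21,apitoken=PVEAPIToken=root@pam!migrate=<token-secret>,fingerprint=<target-fingerprint>' \
    --target-bridge vmbr0 \
    --target-storage local-zfs \
    --online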
The most annoying thing, in my opinion, is that security groups, IP sets, aliases, users, tokens, privileges, roles, jobs and so on are part of the datacenter and not included in any backups. Without a cluster it gets hard to keep all that in sync manually. I've got 4 unclustered nodes here and I need to do everything 4 times, and this isn't that easy because not all 4 servers are running 24/7. So I, for example, change the IP of an alias on the 2 running PVE nodes and want to change it on the two powered-off PVE nodes later that week. Then I forget to do it, migrate a VM some months later to one of those nodes with the unchanged alias, and wonder why my firewall isn't working as expected, since the same alias is referring to different IPs.
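(The datacenter firewall objects do at least live in a single file, so in theory you could crudely copy that between nodes - just a sketch, and note it simply overwrites whatever the target node had:)

# aliases, IP sets and security groups live in this file:
cat /etc/pve/firewall/cluster.fw

# crude one-way "sync" to another unclustered node:
scp /etc/pve/firewall/cluster.fw root@pve-node2:/etc/pve/firewall/cluster.fw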
 
...
The most annoying thing, in my opinion, is that security groups, IP sets, aliases, users, tokens, privileges, roles, jobs and so on are part of the datacenter and not included in any backups. .....

I see.
I don't do anything more than the default settings for those security groups, IP sets, aliases, users, tokens, privileges, roles, jobs and so on. In my case, with the defaults, will your example restore the same VM from A to B?
 

I am also wondering: if I restore a VM that runs some service (say Emby) etc., what will happen to the network settings, IP etc.?
 
I don't do anything more than the default settings for those security groups, IP sets, aliases, users, tokens, privileges, roles, jobs and so on. In my case, with the defaults, will your example restore the same VM from A to B?
Then you don't run your guests as securely as you could. ;)
It's always recommended to use users with privileges stripped down as far as possible, with only the privileges they really need. Same for ports: all ports should be blocked, with manual whitelisting of ports, and only for the IPs that really need access. When doing that you need all those aliases, security groups, tokens, roles and so on.
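As a rough sketch of what that looks like on the CLI (the names and privileges here are just an example, not a specific recommendation):

# a custom role with only the privileges a backup operator needs:
pveum role add BackupOperator --privs "VM.Backup VM.Audit Datastore.AllocateSpace"

# a dedicated user, limited to that role via an ACL:
pveum user add backup@pve
pveum passwd backup@pve
pveum acl modify / --users backup@pve --roles BackupOperator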

I am also wondering: if I restore a VM that runs some service (say Emby) etc., what will happen to the network settings, IP etc.?
With network shares it's usually not problematic, as long as all nodes have the same bridges on the same subnets. LXCs and bind-mounts would be problematic.
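Meaning: if the restored VM's config has net0 on vmbr0, the second node just needs a matching vmbr0 in the same subnet, e.g. in /etc/network/interfaces on both nodes (a sketch with example addresses):

# node 1 (node 2 would use .12 instead of .11):
auto vmbr0
iface vmbr0 inet static
        address 192.168.1.11/24
        gateway 192.168.1.1
        bridge-ports eno1
        bridge-stp off
        bridge-fd 0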
 
Dunuin has already said everything, but if I'm allowed to add my opinion:

As a preface: money isn't unlimited in a home lab, so:

I think 2 servers are necessary and enough for a home lab; just add one quorum device and that's it.
Why would I stick to 2 servers in a home lab?
- Well, you can run OPNsense in HA!

That's important, because whenever you do something on server 1, you still won't lose your internet connection.
I personally run everything on Proxmox and threw all the physical devices away; that's why I mention OPN/pfSense.

The only situation where I would run a proper 3-node cluster at home is if I had 3 NUCs and could use their Thunderbolt ports as 20 Gbit/s FDX links for Ceph.
That's like a RAID5, but with 3 NUCs instead of 3 drives in one system xD

I personally run a big-little combo:
- Meaning one really expensive (but power-efficient) big server: 5800X / 128 GB ECC / 8x 20 TB HDD in Z2 / 4x 990 Pro 2 TB in RAID10 / Arc A380 / etc...
- One little server: a NUC13ANHi3 with one 1 TB SSD drive and one 990 Pro xD

The small one I only need to keep my internet connection and VPN up, in case something happens or I'm doing something with the big server.
That means the small server runs another OPNsense instance for HA + another Pi-hole instance with keepalived (HA).
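The Pi-hole keepalived part is basically just one VRRP block per instance with a shared virtual IP, something like this (a sketch; the IPs, interface and priority are examples - the second instance gets state BACKUP and a lower priority):

vrrp_instance PIHOLE {
    state MASTER
    interface eth0
    virtual_router_id 53
    priority 150
    advert_int 1
    virtual_ipaddress {
        192.168.1.5/24
    }
}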

I personally prefer the big-little combo for a home lab, as in my case OPNsense HA + Pi-hole HA do a failover at lightspeed compared to a proper Proxmox HA cluster without shared storage.
(My big-little combo runs as a cluster too, just without HA, as that's not really needed here, but I have an additional quorum device anyway; otherwise, if one node is down and you reboot the other node, VMs won't get started xD)
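(Without that extra vote, the manual escape hatch on a 2-node cluster would be telling the surviving node that one vote is enough - a sketch, and only something to do temporarily while the other node is down:)

# on the remaining node, lower the expected vote count so guests can start again:
pvecm expected 1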

TBH, for a home lab it's actually not that critical what you run; it's basically your playground anyway. You can play with a 3-node cluster, you can even do a single node in some environments, you can do whatever you like.

But for the sake of it, you should at least try out a proper 3-node cluster yourself, just for your own experience and fun.

Cheers
 
Why would I stick to 2 servers in a home lab?
- Well, you can run OPNsense in HA!
Jup, wouldn't want to run my OPNsense VMs without that. So much complexity and stuff that could break, and it would usually be really annoying if your only PVE server failed and your complete household was offline for 1 or 2 weeks because you need to debug the problem, maybe order some replacement parts and find some time to fix stuff.
It's a similar problem when running critical services like a virtualized NAS, where you store all your important files in a centralized location with bit-rot protection and redundancy instead of storing stuff locally on unreliable NTFS partitions. It would be really bad if you needed to access some important files but couldn't start that NAS VM for 1 or 2 weeks. With a second node and proper backups you could at least restore that VM on the second node that is still working to access your stuff, while fixing the failed server.

I personally run a big-little combo:
- Meaning one really expensive (but power-efficient) big server: 5800X / 128 GB ECC / 8x 20 TB HDD in Z2 / 4x 990 Pro 2 TB in RAID10 / Arc A380 / etc...
- One little server: a NUC13ANHi3 with one 1 TB SSD drive and one 990 Pro xD
Same here. It saves me 500 € or so on electricity per year, with the little server running 24/7 (with all critical services like router, monitoring, smart home, wiki, Pi-hole, log server, Nextcloud for calendar/contacts/bookmarks/password-safe/todo/notes syncing, reverse proxy, Guacamole) and the big one only running on demand when needed (running NAS, media server, document management system, game servers, ... all the stuff I don't really need when I'm not at home or while sleeping).
 
Jup, wouldn't want to run my OPNsense VMs without that.
Since you would need a dedicated connection from each host to the physical router anyway, you'd be better served by having a pfSense VRRP cluster. It wouldn't be beholden to its parent host, and you won't have an outage in case of host failure while the host gets fenced and the copy is spun up on the other host.
With a second node and proper backups you could at least restore that VM on the second node that is still working to access your stuff, while fixing the failed server.
That would depend on having sufficient spare space to restore your backup. Unless you get free equipment and storage, it's hard to justify the cost and power expenditure for a once-in-a-blue-moon outage. It's a home lab; most of the stuff on there is completely replaceable and not needed on a random-access basis.
 
Since you would need a dedicated connection from each host to the physical router anyway, you'd be better served by having a pfSense VRRP cluster. It wouldn't be beholden to its parent host, and you won't have an outage in case of host failure while the host gets fenced and the copy is spun up on the other host.
I already use pfsync/CARP for failover. Works great with basically no interruption.

That would depend on having sufficient spare space to restore your backup. Unless you get free equipment and storage, it's hard to justify the cost and power expenditure for a once-in-a-blue-moon outage. It's a home lab; most of the stuff on there is completely replaceable and not needed on a random-access basis.
Depends... if it's just a home lab for learning and testing, that might be fine. But if you really rely on it because it's a "productive" home server (I, for example, digitize all my papers and destroy most of the physical letters, so if I need to access some documents, I really need that NAS and Docker VM working... or at least access to the backups), I still want full redundancy. I, for example, have identical disks in both servers, both servers run a TrueNAS VM, and the data is kept in sync with replication. So I don't have to care if a server fails; I still have a recent copy of everything easily accessible. You should have a backup of everything important to you anyway, so it shouldn't be a problem to buy all disks 2 or 3 times and put them in different machines. And when buying similar servers, RAM and CPU aren't a big problem. I can move CPUs, RAM, NICs, HBAs, disks and so on between the servers. So the backup server only needs minimal hardware, and in case the main server fails, I can just move most of the hardware from the main to the backup server and restore the VMs there.
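The replication between the two TrueNAS VMs is just replication tasks in the TrueNAS UI; under the hood that is essentially ZFS snapshot send/receive, roughly like this (a sketch with made-up pool/dataset/host names):

# on the primary NAS VM: snapshot, then send the increment to the standby NAS VM
zfs snapshot tank/data@2024-05-01
zfs send -i tank/data@2024-04-30 tank/data@2024-05-01 | \
    ssh root@truenas-standby zfs receive tank/data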
 
Since you would need a dedicated connection from each host to the physical router anyway, you'd be better served by having a pfSense VRRP cluster. It wouldn't be beholden to its parent host, and you won't have an outage in case of host failure while the host gets fenced and the copy is spun up on the other host.
Why?
In my case it's a PPPoE modem.

I simply connected my PPPoE modem (Vigor 167) to the central switch here...
And it's on a separate VLAN, its own VLAN, where only the OPNsense has access to it.

I would use the separate-VLAN method for any other connection to the ISP modem as well.

And the rest is a normal OPNsense HA setup with CARP on all interfaces (or almost all...)

OPNsense is even clever enough that you can simply sync almost everything between both instances, and the "differences" that still need to be kept separate get kept; it's actually pretty amazing.
But off topic :)
 
Same here. It saves me 500 € or so on electricity per year, with the little server running 24/7 (with all critical services like router, monitoring, smart home, wiki, Pi-hole, log server, Nextcloud for calendar/contacts/bookmarks/password-safe/todo/notes syncing, reverse proxy, Guacamole) and the big one only running on demand when needed (running NAS, media server, document management system, game servers, ... all the stuff I don't really need when I'm not at home or while sleeping).
That's a bit different, as I run everything on the big one xD
But I like your idea and it makes absolute sense.

Just, in my case the big server with 8 HDDs and everything takes around 115 W...
Which I didn't find that much...
The NUC I never measured, but for sure it's in the range of 15 W at max :)

However, I need the big thing here, because it gets accessed all the time: at midday for work stuff, in the evening for movies, etc. xD
I mean the storage especially :)

If I didn't need the storage all the time, I would for sure use the small one more xD
 
Just, in my case the big server with 8 HDDs and everything takes around 115 W...
Which I didn't find that much...
Here in Germany that's 402 € / 451 USD per year.
Shut it down for the 8 hours while you're sleeping, and over 10 years that's 1340 € / 1504 USD saved.
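For reference, the math behind those numbers (they imply an electricity price of roughly 0.40 €/kWh):
0.115 kW × 24 h × 365 days ≈ 1007 kWh per year, × 0.40 €/kWh ≈ 403 € per year.
Shutting it down 8 h/day saves 0.115 kW × 8 h × 365 days ≈ 336 kWh per year, × 0.40 €/kWh ≈ 134 € per year, so roughly 1340 € over 10 years.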
 
I still emotionally can't agree with the decision to run pfSense in a VM.
Too much risk IMHO.

Reminds me of a matryoshka :)
Emotional damage xD

Everyone to his liking, but tbh, why not?
See the benefits: I have, for example, the CPU power of a Ryzen 5800X. That's a lot better than some dedicated box with an Atom xD
It's more energy efficient too, since Proxmox runs 24/7 anyway.
You have less hardware to worry about xD
You can do high availability xD

I run it semi-virtual though, meaning I always pass a physical NIC port through to my OPNsense if I can, and do VLANs there etc...
That gives me the benefit of being able to enable some hardware acceleration too.
On my X550-T2, I use one 10GbE port for Proxmox and pass the other one through to OPNsense completely. I used SR-IOV for that earlier, but I moved away from SR-IOV and pass through the whole physical port, which works a LOT better.
I don't have to fiddle anymore with bridge FDB tables or any of the other issues I had with virtual functions.
And it allows me to use all the hardware acceleration, which btw works flawlessly! (Only if you use Suricata do you have to disable some of the hardware acceleration.)
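Passing a port through is just one line on the VM (a sketch; the VMID and PCI address are placeholders - check yours with lspci, and pcie=1 needs the q35 machine type):

# find the NIC's PCI address:
lspci | grep -i ethernet

# pass the second port (function .1 here) through to the OPNsense VM:
qm set 101 --hostpci0 0000:03:00.1,pcie=1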

However, the benefits, at least for me, outweigh the downsides, if there are any at all.
Maybe the still-broken ballooning driver in pf/OPNsense is a downside, but I simply deactivate it anyway on FreeBSD...
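Deactivating it is just this (sketch, placeholder VMID):

# disable the balloon device for the OPNsense VM:
qm set 101 --balloon 0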

Cheers
 

I messed up the PVE 7 to 8 upgrade last week (it was not dramatic, but it took me a day to resolve).

I was able to get by without my VMs, but...

I can't stomach losing the internet along with my VMs.

That's actually why I started this thread - to understand how to build a bulletproof PVE infrastructure.
 
