Any suggestions for my home setup? 2x DL360p, ZFS replication, HA(?)

Jon Lachmann

Well-Known Member
Apr 27, 2011
Hi all,

I am upgrading my home server setup. At the moment it is a poor solution: one server running VirtualBox on Windows, hosting email and web.

I have ordered 2x HP DL360p G8, each with 48 GB RAM, 2x 8-core Xeons and 2x 240 GB SSDs. I have also ordered 2x LSI SAS 9207-8i cards so I can run the drives without hardware RAID.

My plan is to install Proxmox on both machines and use ZFS to do RAID1 across the two drives in each machine, to protect against data loss. Since they are all SSDs, I would not need any caching volume, right? The plan is to have 2-3 KVM guests running on one server and replicate them to the other one. I do not expect too many changes to the guests' volumes, so I have not invested in 10GbE, but I could if need be.
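
To be concrete, this is roughly what I picture for the replication part once Proxmox is installed (the VMID, node name and schedule are just placeholders I made up):

# verify the ZFS mirror created by the installer on each node
zpool status rpool

# replicate guest 100 to the other node every 15 minutes
pvesr create-local-job 100-0 pve2 --schedule "*/15"

# check that the replication jobs run
pvesr status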

I would like to run HA, but from my understanding I need a third node for quorum to work. Would a Raspberry Pi work, as suggested in the wiki? Are there any downsides? From my understanding I only need it to break the tie.

Another thing I wonder about: what are the implications of the warning on this page https://pve.proxmox.com/pve-docs/chapter-pvesr.html, "redistributing services after a more preferred node comes online will lead to errors."? I am not completely sure what is meant by that, could someone explain?

That's all. I am happy to receive suggestions on this setup: what is good, what is bad, etc.?

Thank you in advance!
 
Since they are all SSDs, I would not need any caching volume, right?
Adding more RAM instead of using an SSD for an L2ARC is usually the better solution. So yes, no caching SSD needed.
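
If you want to check how the ARC is actually doing before deciding, or cap how much RAM ZFS may use, something along these lines works (the 16 GiB limit is just an example value):

# show ARC size and hit-ratio statistics
arc_summary | less

# optionally cap the ARC, here at 16 GiB (value is in bytes);
# this creates/overwrites /etc/modprobe.d/zfs.conf
echo "options zfs zfs_arc_max=17179869184" > /etc/modprobe.d/zfs.conf
update-initramfs -u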

I would like to run HA, but from my understanding I need a third node for quorum to work. Would a Raspberry Pi work, as suggested in the wiki? Are there any downsides? From my understanding I only need it to break the tie.
Adding a QDevice to the cluster is enough for that.
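
For reference, the setup is roughly this (the Pi's IP is just a placeholder):

# on the Raspberry Pi (or any external Linux box)
apt install corosync-qnetd

# on every PVE cluster node
apt install corosync-qdevice

# then, from one PVE node, add the QDevice to the cluster
# (it will ask for root SSH access to the external device)
pvecm qdevice setup 192.168.1.50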

Another thing I wonder about: what are the implications of the warning on this page https://pve.proxmox.com/pve-docs/chapter-pvesr.html, "redistributing services after a more preferred node comes online will lead to errors."? I am not completely sure what is meant by that, could someone explain?
Sorry, but I don't really know myself what is meant by this. Maybe someone else has more insight :)
 
I would like to run HA, but from my understanding I need a third node for quorum to work. Would a Raspberry Pi work, as suggested in the wiki? Are there any downsides? From my understanding I only need it to break the tie.

In general Aaron is right: a QDevice is enough and can be run on any modern Linux distro. It just needs to be external.

In general, though, it is IMO better to use three slightly "smaller" nodes than two "big" ones (performance- and capacity-wise), if space and requirements allow it. That allows HA without extra daemons, spreads the load better and makes it really easy to do maintenance on one node, be it software maintenance (big upgrades, or kernel upgrades) or HW maintenance (failed disk, ..). This is a huge benefit IMO, and takes quite some stress out of the aforementioned maintenance work. Just my two cents :)
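
For example, before such maintenance you normally just live-migrate the guests away first; something like this (VMID and node name are placeholders):

# move a running VM off the node that needs maintenance
qm migrate 100 pve2 --online
# repeat for the other guests (or use the bulk migrate action in the GUI), then work on the node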
 
In general Aaron is right: a QDevice is enough and can be run on any modern Linux distro. It just needs to be external.

In general, though, it is IMO better to use three slightly "smaller" nodes than two "big" ones (performance- and capacity-wise), if space and requirements allow it. That allows HA without extra daemons, spreads the load better and makes it really easy to do maintenance on one node, be it software maintenance (big upgrades, or kernel upgrades) or HW maintenance (failed disk, ..). This is a huge benefit IMO, and takes quite some stress out of the aforementioned maintenance work. Just my two cents :)
I hear that; it seems to be in line with most other suggestions. Maybe we will add a third (proper) node in time, but for now the project is already a bit over budget. In the case of hardware maintenance, a failed disk for example, how does a third node help? Is it just to take the load off of the others that are still working? To have redundancy during the maintenance? Or are there more benefits?

Do you know anything about the "redistributing services after a more preferred node comes online will lead to errors."? I am really curious about what this implies.

Hi,

What services do you want to run in your VMs? Which OS will they run?
The plan for now is to run one Ubuntu server with VestaCP for web and proxy, and another Ubuntu server for mailcow. There might be a need for a Windows Server (2008 R2 maybe) to run some services for a short transition period, but the plan is to not have to run Windows at all.

For now, pfSense is used as a firewall on a separate machine. This has worked well, but maybe it would be a good idea to also run it as a VM? How would you do the networking for that? (Both servers have a 4x Gbit NIC and the incoming internet is Ethernet with a static IP.) Maybe a second switch?
 
In the case of hardware maintenance, a failed disk for example, how does a third node help? Is it just to take the load off of the others that are still working? To have redundancy during the maintenance?


Any operation on the cluster, like starting/stopping/creating a VM or a live migration, needs quorum (2 operational nodes out of 3 in your case). So with only 2 nodes it is very easy to end up in a split-brain scenario.
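
You can always check the current quorum state with:

# shows votes, expected votes and whether the cluster is quorate
pvecm status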
 
So with only 2 nodes it is very easy to end up in a split-brain scenario.

I think it was just a wording mistake and you know your stuff, but quorum exists explicitly to avoid split-brain issues :)
So with two nodes, one loses quorum immediately if one node goes down, yes, but as long as no manual hacking around is done afterwards, one is 100% safe from split brain.
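
(The "manual hacking around" I mean is basically forcing quorum on the surviving node, which is exactly where one can shoot oneself in the foot if the other node is not really dead:)

# DANGEROUS: tell corosync to treat a single remaining node as quorate
pvecm expected 1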
 
Is it just to take the load off of the others that are still working? To have redundancy during the maintenance?

It really helps with both things: spreading out the load and having further redundancy.

The other node may need to take over all the services during that time (if your environment requires that), and with two nodes + QDevice, each of the two nodes needs to be able to host the whole load of the other in addition to its own.
With three full nodes they only need to cover half of that additional load. Depending on what runs in the cluster that may not be an issue; it's just something one should keep in mind, as otherwise one may max out both nodes to almost 100% and, once something fails, the other one gets overloaded.
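
A rough illustration with made-up numbers: say all guests together need 100 "units" of capacity.

2 nodes + QDevice: each node normally carries 50 units and must be able to absorb the other's 50 on failover -> each node needs to be sized for 100 units.
3 nodes: each node normally carries ~33 units and a failed node's load is split over the two survivors -> each node only needs to be sized for ~50 units.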

Do you know anything about the "redistributing services after a more preferred node comes online will lead to errors."? I am really curious about what this implies.

That is (was?) only for the cases where HA groups with priorities were explicitly set. I do not have the replication code and design fully in mind, but HA always uses the common API migration code path to move VMs/CTs, so this really should not be an issue.
I'll re-check this and remove the warning if I'm right.
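
For context, the priorities that warning talks about are the ones on HA groups, e.g. something like this (names are placeholders):

# prefer pve1 (higher priority wins), fall back to pve2
ha-manager groupadd prefer-pve1 --nodes "pve1:2,pve2:1"

# put a VM under HA management in that group
ha-manager add vm:100 --group prefer-pve1

HA groups also have a nofailback flag if you do not want services to be moved back automatically once the preferred node is online again.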

How would you do the networking for that? (Both servers have a 4x Gbit NIC and the incoming internet is Ethernet with a static IP.) Maybe a second switch?

What is often done for that is to create a second bridge with the Ethernet port where the incoming internet service is connected as its bridge port; that is the WAN side.
Then add two NICs to the pfSense VM: one on the default bridge, connected to the LAN (and the other VMs), and the other one on the newly created bridge as the WAN side for pfSense.
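
On the PVE host that could look roughly like this in /etc/network/interfaces (interface names and addresses are just examples):

# LAN bridge, shared by the host and the VMs
auto vmbr0
iface vmbr0 inet static
        address 192.168.1.2/24
        gateway 192.168.1.1
        bridge-ports eno1
        bridge-stp off
        bridge-fd 0

# WAN bridge: only the pfSense VM gets a NIC on this one
auto vmbr1
iface vmbr1 inet manual
        bridge-ports eno2
        bridge-stp off
        bridge-fd 0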

The drawback is that this VM is then bound to that server, unless you add a switch and create the same bridge setup on the other PVE node as well.
Another variant would be using VLANs; then you could work with a single switch (if it supports VLANs), and the idea would stay the same.
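
With the VLAN variant you would make the main bridge VLAN-aware and carry the WAN as a tagged VLAN instead of using a separate port, roughly like this (VLAN ID and addresses are just examples):

auto vmbr0
iface vmbr0 inet static
        address 192.168.1.2/24
        gateway 192.168.1.1
        bridge-ports eno1
        bridge-stp off
        bridge-fd 0
        bridge-vlan-aware yes
        bridge-vids 2-4094

# then give the pfSense WAN NIC the VLAN tag (e.g. tag=10) in the VM's hardware settings
# and carry that VLAN to the modem port on the switch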
 
