Pre-install feedback

Hans Gruber

New Member
Jul 25, 2024
I'm at the planning stage of moving my small company's bare-metal servers (3 Windows servers) over to a small Proxmox cluster.

Here is my initial plan; I would really welcome any feedback if anyone has time to review it.

Goal: Build a Proxmox cluster comprising two nodes and a QDevice to handle quorum votes.
- Each of the nodes can support the entire VM stack when the other node fails.
- Replicate the VMs to the other node every 15 minutes.
- Backup to a remote PBS server over a 1Gb leased line twice each day.

I need to host 3 Windows Server VMs, each with around 500GB of data and 16GB RAM. These are low-bandwidth servers running AD, file shares and a SQL database app.

The main nodes have 2 NICs on the motherboard (1x2.5Gb, 1x1Gb). I plan to use the faster 2.5Gb NICs along with a dedicated 2.5Gb switch as a Proxmox-only network to sync data, and the 1Gb NIC to connect to our 1Gb LAN switch.
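Something like this in /etc/network/interfaces is what I have sketched so far (NIC names and the gateway are placeholders): a bridge on the 1Gb NIC for the VMs and management, and the 2.5Gb NIC left unbridged for cluster/replication traffic.

auto eno1
iface eno1 inet manual

auto vmbr0
iface vmbr0 inet static
        address 192.168.0.100/24
        gateway 192.168.0.1
        bridge-ports eno1
        bridge-stp off
        bridge-fd 0

auto enp2s0
iface enp2s0 inet static
        address 10.0.0.100/24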

Here's the proposed server specs and IP config:

PROX1
16 core / 64GB RAM
ZFS boot mirror 128GB SSD
ZFS storage pool mirror 2TB SSD
2.5Gb Ethernet & 1Gb Ethernet
Dedicated APC UPS
NIC1 (1Gb) 192.168.0.100 - LAN
NIC2 (2.5Gb) 10.0.0.100 - CLUSTER
VM01 (192.168.0.50) -> replicated to other node
VM02 (192.168.0.52) -> replicated to other node

PROX2
16 core / 64GB RAM
ZFS boot mirror 128GB SSD
ZFS storage pool mirror 2TB SSD
2.5Gb Ethernet & 1Gb Ethernet
Dedicated APC UPS
NIC1 (1Gb) 192.168.0.101 - LAN
NIC2 (2.5Gb) 10.0.0.101 - CLUSTER
VM03 (192.168.0.53) -> replicated to other node

QDEV1
RPi4 (8GB RAM)
Debian
QDevice
Storage: 128GB SSD (not RAID)
NIC (1Gb) 10.0.0.3 - CLUSTER
Dedicated APC UPS

Questions
1. My understanding is that if I replicate the VMs to the other node every 15 minutes, HA will operate with up to a 15-minute window of data loss?
2. If I install a PBS at a remote location, can I configure backups to run twice each day to that location over a 1Gb leased line?
3. Does anyone have experience using an RPi4 as a QDevice? It feels a bit flaky to me; I could swap it out for a low-end N100 device running Proxmox from a ZFS mirror if recommended.
4. Would I be better off downscaling the spec of the servers so that I can have 3 identical servers with less RAM and storage, and no QDevice?
 
Please don't cheap out on the SSD storage: use proper enterprise ones with PLP (and you might not even need a UPS for all nodes then), especially because changes (though not all of the writes) are being replicated and you are running databases.
16 cores per node seems a bit overkill for at most 3 VMs, so you might save some money there and get 8-core Ryzens.
Do the VMs really need 16GB each? 64GB would run them fine but I would not do 3 nodes with 32GB each. You can safely over-commit on CPU but not on RAM.
Maybe separate the replication network from the corosync network, as the former can eat a lot of bandwidth and the latter needs low latency (VLAN with switch priority, maybe?).
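Corosync can also be given redundant links of its own in /etc/pve/corosync.conf, so at least one low-latency path stays clear of bulk traffic. A minimal sketch of the nodelist, reusing the names and IPs from your plan:

nodelist {
  node {
    name: prox1
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 192.168.0.100
    ring1_addr: 10.0.0.100
  }
  node {
    name: prox2
    nodeid: 2
    quorum_votes: 1
    ring0_addr: 192.168.0.101
    ring1_addr: 10.0.0.101
  }
}

Here ring0 (the quiet LAN) would be the primary corosync link and the 2.5Gb replication network only the fallback; kronosnet fails over between links automatically.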
 
I had literally never heard the term PLP before, that was a fun rabbit hole :-)

I've uprated the storage to use only drives with PLP - great call, thanks.

I can also switch to 8-core Ryzens, thanks again.

The VMs do not need 16GB each; they are just configured that way on the bare-metal hardware where they currently reside. The AD server needs 8GB at most and the file server the same. I figured I could provision 3x16GB on a 64GB system and let KVM memory ballooning take care of the provisioning; is that not correct? Also, I was planning to put the database VM on node (A) and the less resource-intensive VMs on node (B). They would only end up on the same node in a failover situation.
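For the record, my understanding is that this is set per VM, e.g. (VMID 100 as an example; 16GB ceiling, 8GB floor, and it needs the virtio balloon driver installed inside the Windows guest):

# Allow the VM up to 16GB but let the balloon driver reclaim down to 8GB
qm set 100 --memory 16384 --balloon 8192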

I need to read more on corosync latency. I figured running both corosync and replication on the separate 2.5Gb switch would be logical, but I guess the bandwidth of the replication makes the latency unpredictable, right? Would limiting the replication bandwidth as below solve this?

I have this in my setup notes:

# Create a replication job which runs every 15 minutes with a limited bandwidth of 2000 MB/s (megabytes per second) for the guest with ID 100.
pvesr create-local-job 100-0 pve1 --schedule "*/15" --rate 2000
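One thing I noticed writing this up: the --rate value is in MB/s, and a 2.5Gb link only carries roughly 300 MB/s, so the 2000 above would never actually be the limiting factor. Something like this should cap replication below the link speed and leave corosync some headroom (same job ID as above):

# Cap replication at 250 MB/s so the 2.5Gb link never fully saturates
pvesr update 100-0 --rate 250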

Do you have any opinion on the QDevice vs an N100 3rd Proxmox node?
 
1. Replication and Data Loss
Question: If I replicate the VMs to the other node every 15 minutes, HA will operate with a 15-minute window of data loss?

Answer: Yes, if you set up VM replication to occur every 15 minutes, there is a potential for up to 15 minutes of data loss in the event of a failure. This is because any changes made to the VMs in the interim period will not be reflected on the backup node until the next replication cycle. For critical applications where data loss is a concern, you might want to consider more frequent replication or using continuous data protection (CDP) solutions.
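If a 15-minute window is too wide for the SQL VM, the schedule on an existing job can be tightened. A sketch, assuming the job ID 100-0 used elsewhere in this thread:

# Replicate every 5 minutes instead of every 15
pvesr update 100-0 --schedule "*/5"

Each run must finish before the next begins, so tighter schedules only help while the per-interval delta stays small.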

2. Backup to Remote PBS
Question: If I install a PBS at a remote location, can I configure backups to run twice each day over a 1Gb leased line?

Answer: Yes, you can configure Proxmox Backup Server (PBS) to perform backups over a 1Gb leased line. The frequency and duration of the backups will depend on the total volume of data and the changes between backups. Ensure that your backup window (time taken to complete the backup) fits within your operational requirements. You might also want to use incremental backups to reduce the amount of data transferred each time.
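Once the remote PBS is reachable, it is added as a storage entry on the PVE side and backup jobs can target it like any local storage. A rough sketch of the /etc/pve/storage.cfg entry (server address, datastore name and user are placeholders):

pbs: pbs-remote
        server 203.0.113.10
        datastore daily-backups
        username backup@pbs
        fingerprint <TLS fingerprint of the PBS host>
        content backup

Since PBS backups are incremental and deduplicated, only the initial seed should take long over the 1Gb line.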

3. Using RPi4 as a QDevice
Question: Does anyone have experience using an RPi4 as a QDevice? Is it reliable, or should I use a low-end N100 device with Proxmox?

Answer: While the RPi4 can function as a QDevice, it might not be as reliable or performant as a low-end x86 device. The QDevice is critical for maintaining quorum in your cluster, and any failure here can impact the entire cluster’s ability to function correctly. Using a more robust and reliable device like an Intel NUC or similar low-end N100 device with proper storage and power redundancy (ZFS mirror and dedicated UPS) would be a more resilient choice.
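For reference, the QDevice procedure is the same whichever hardware is chosen; roughly (using the 10.0.0.3 address from the plan):

# On the external device (Pi or N100):
apt install corosync-qnetd
# On both cluster nodes:
apt install corosync-qdevice
# Then, from one cluster node:
pvecm qdevice setup 10.0.0.3

So the hardware choice is purely a reliability question, not a configuration one.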

4. Server Specifications and Cluster Design
Question: Would I be better off downscaling the spec of the servers to have three identical servers with less RAM and storage, and no Q-Device?

Answer: Having three identical servers without a QDevice simplifies the setup and provides better redundancy. Each node would contribute to the quorum, and you wouldn't need a separate QDevice. Here’s a revised setup idea:
 
3 Windows Server VMs, each with around 500GB of data and 16GB RAM.
ZFS storage pool mirror 2TB SSD
So I understand you have a total of about 1.5T of data. Assuming the mirrored ZFS storage pool has 2T free (which it won't!), that means you are already using 75% of the pool. That is about as full as any ZFS pool should get in such an environment.
And then you suggest:
with less RAM and storage
I actually think you need more storage.
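A quick way to keep an eye on this once the pool is live; the CAP column is the fill percentage to keep under roughly 80%:

# Show pool size, allocation and fill percentage
zpool list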
 
Thank you for the excellent feedback. I will implement all of your advice.
 
You are right, I increased the storage to 3TB per node.
 
