Network Advice

siil-itman

New Member
Jan 18, 2026
We are setting up a test lab at work, with a view to an enterprise deployment if things go well, so we can start moving away from our current hypervisor.
The initial setup is:
HP ML10s (old model)
32 GB RAM
4x 4 TB disks in RAIDZ1
4x 1 GbE NICs

We have two of these servers in a cluster.

The four network ports on each server are connected as follows (a rough config sketch follows the list):
nic0 - connected to external network 1; currently in the Linux bridge vmbr0, which carries the IP address we manage each server through
nic1 - connected to external network 1, no IP assigned and not added to a bridge/bond/VLAN
nic2 - connected to external network 2, no IP assigned and not added to a bridge/bond/VLAN
nic3 - connected to external network 3; this will be used for iSCSI to the external storage
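For context, this layout corresponds roughly to an /etc/network/interfaces along the following lines; the interface names (nic0-nic3) and all addresses are placeholders for whatever your nodes actually use:

Code:
# Sketch of the layout described above; names and addresses are placeholders
auto lo
iface lo inet loopback

iface nic0 inet manual

auto vmbr0
iface vmbr0 inet static
        address 192.0.2.10/24
        gateway 192.0.2.1
        bridge-ports nic0
        bridge-stp off
        bridge-fd 0

# nic1 and nic2: cabled, but no IP and not part of any bridge/bond yet
iface nic1 inet manual
iface nic2 inet manual

# nic3: dedicated iSCSI network, plain IP, no bridge required
auto nic3
iface nic3 inet static
        address 198.51.100.10/24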

My questions relate to the network configuration.
Should we be configuring the networking on each server individually, or using the SDN?
Can a Linux bridge use more than one Ethernet connection, or should we be using a bond?
If we configure on each server, how do I set up multiple gateways, or would it be better to use VLANs?
If we use the SDN, can I bind specific zones/VNets/subnets to a specific interface?

Sorry for the noob questions, but I want to get this up and running quickly to prove the concept to management so I can get the budget released for a production deployment (we want new servers and storage) plus the training courses.
 
Hello,

Some information is missing from your post: what kind of storage disks are you using? How is your external storage set up (mirror, HW RAID etc.)?

RAIDZ (especially on HDDs) is not suitable for hosting VMs, but might be OK for the operating system and bulk data; see: https://forum.proxmox.com/threads/fabu-can-i-use-zfs-raidz-for-my-vms.159923/


In general, 1G is quite low for a storage network; you might get better performance if you use local SSDs (DC-grade with power-loss protection) in a ZFS mirror as VM storage. Then you could use storage replication to get a kind of "pseudo-shared" storage with HA:
https://pve.proxmox.com/wiki/Storage_Replication
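As a rough CLI sketch (the VM ID, target node name and schedule here are placeholders; the same can be configured in the GUI), a replication job looks like this:

Code:
# Replicate VM 100 from the local node to the node "pve2" every 15 minutes
pvesr create-local-job 100-0 pve2 --schedule "*/15"

# Show the state of all replication jobs on this node
pvesr status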

1G is fine for a dedicated cluster network though, as recommended in the manual:
https://pve.proxmox.com/wiki/Cluster_Manager#pvecm_cluster_network

So you would connect your nodes to their own dedicated network on NIC2 or NIC3, and then add the other NICs as additional corosync links for redundancy. This is important (details in the linked wiki page) since the cluster/corosync communication is quite latency-sensitive and doesn't like additional traffic on its network. With a dedicated link plus additional fallback links (which can also serve as the management and storage networks) you ensure that the cluster nodes can always communicate with each other.
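For illustration (the cluster name and addresses are just examples), redundant corosync links can be defined when the cluster is created:

Code:
# On the first node: dedicated corosync network as link0, second network as fallback link1
pvecm create labcluster --link0 10.10.10.1 --link1 10.20.20.1

# On the second node: join the cluster, passing its own addresses on those networks
pvecm add 10.10.10.1 --link0 10.10.10.2 --link1 10.20.20.2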

Also: for high availability your cluster needs at least three nodes to avoid a split-brain scenario. If you can't afford a third server, this can be mitigated with a quite small device running Debian Linux and a voting daemon (QDevice):
https://pve.proxmox.com/wiki/Cluster_Manager#_corosync_external_vote_support
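Setting that up is roughly this (the QDevice's IP is a placeholder):

Code:
# On the small Debian machine providing the external vote
apt install corosync-qnetd

# On every cluster node
apt install corosync-qdevice

# On one cluster node: register the QDevice, then check the votes
pvecm qdevice setup 10.10.10.5
pvecm status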

If you don't need cluster-specific features like storage replication or high availability, using the Proxmox Datacenter Manager (which can be installed as a VM on one of the nodes) for migration and for accessing the nodes from a single management interface might be more useful for your use case. And of course you can also use it with a cluster.

Another thing to consider: do you have a plan for backups? It might be worth adding Proxmox Backup Server to the scope of your evaluation. Since PBS (like PVE) is basically Debian, you can also use it as a QDevice. To strictly separate backups from your hypervisor, I would recommend setting it up as described by Proxmox developer Aaron in this post:

In addition to the cluster documentation, you might also want to read the following regarding Ceph:
https://pve.proxmox.com/wiki/Deploy...r#_recommendations_for_a_healthy_ceph_cluster

Although your current setup plans to use external storage, an HCI infrastructure might be of interest for your next hardware renewal.
With your current network it probably won't be worth it though, since Ceph needs at least a 10 Gbit link; see also https://forum.proxmox.com/threads/fabu-can-i-use-ceph-in-a-_very_-small-cluster.159671/ and the discussion there. Please note that Udo assumes in his write-up that somebody wants to use Ceph's auto-healing features, which would need at least four nodes to survive the outage of two nodes. If your main use case for a cluster is to keep operating during maintenance, or you only want to be able to survive the outage of one node, three nodes might still be good enough. Several pros have reported here that they have customers who are quite happy with their small clusters (two nodes + QDevice + external storage, two nodes + QDevice + ZFS replication, three nodes + external storage, three nodes + Ceph, or a combination of everything).

Hope that helps, regards, Johannes.
 
That helps a lot, Johannes. This is only an initial test setup we are creating from some old spare servers so I can do a PoC for management and set up a dev/lab environment. Your links have given me enough information to hopefully resolve my current niggles.

Based on the info you have provided, we will look to build our production setup with at least 3 or 4 servers (I like the idea of Ceph), multiple 10G networks, and an attached SAN or iSCSI NAS for the shared storage.

One question about the host systems' disks: I see the problem with ZFS, so is BTRFS a better option, or should we just stick with RAID10?
 
Based on the info you have provided, we will look to build our production setup with at least 3 or 4 servers (I like the idea of Ceph), multiple 10G networks, and an attached SAN or iSCSI NAS for the shared storage.

If you are renewing your hardware anyhow, I would consider even faster networks like 25G or even 100G (depending on your budget) to have some room for growth. @Falk R. has mentioned several times that for new setups he usually goes with 100G if the customer's budget allows it, and at least 25G. Maybe he can chime in on how to size a new environment. A combination of external shared storage and Ceph is of course possible; I remember that the Proxmox website has a success story from the American hosting provider HorizonIQ ( https://www.proxmox.com/en/about/about-us/stories/story/horizoniq ), who daily-drive a combination of Ceph and an attached shared flash storage for latency-sensitive workloads. Although their cluster is a "little" bit larger than yours (they have 19 nodes).


One question about the host systems' disks: I see the problem with ZFS, so is BTRFS a better option, or should we just stick with RAID10?

btrfs is in theory a kind of re-implementation of ZFS that avoids the licensing issues which prevent the inclusion of ZFS in the default kernel. For Proxmox VE and Proxmox Backup Server this doesn't matter that much though, since the Proxmox team builds their own kernel (based on Ubuntu's) with ZFS support.
A company (like Canonical or Proxmox) can decide to live with the legal risk, while community projects like the Linux kernel or Debian tend to take a "better safe than sorry" approach.

The thing with btrfs is that it still lacks some parts (for example, mirrors and striped mirrors (RAID1/RAID10) work, but RAID5/6 etc. are experimental and might cause data loss), and its support in PVE is still considered a "technology preview". Nonetheless people use it, since the RAID1/RAID10 part is mature enough.
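If you do end up trying btrfs for the RAID1/RAID10 use case, the creation itself is straightforward; a sketch with placeholder device names (use /dev/disk/by-id paths in practice):

Code:
# Striped mirror (RAID10 profile) for both data and metadata; devices are placeholders
mkfs.btrfs -L vmdata -d raid10 -m raid10 /dev/sda /dev/sdb /dev/sdc /dev/sdd

# Mount with transparent compression
mkdir -p /mnt/vmdata
mount -o compress=zstd /dev/sda /mnt/vmdata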

On the other hand, if you have a reason to avoid ZFS which isn't the licensing, the same technical reasons also apply to btrfs, since both work similarly as copy-on-write filesystems. So I would stick with ZFS (since it's better integrated into Proxmox VE) for its advanced features, unless you want to use HW RAID. But as far as I know at least part of the feature set is also present in btrfs (like transparent compression).

May I ask whether you are using RAIDZ1 (which is like HW RAID5), a mirror (HW RAID1), or a striped mirror (RAID10)? Mirrors or striped mirrors are actually the best option to get maximum performance out of your storage with ZFS, so building a striped ZFS mirror out of your four disks is the best way to go. Of course (as with Ceph and ZFS in general) HW RAID is a big no-no then, since the default configuration of ZFS doesn't play nice with HW RAID (it needs direct access to the disks).
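For reference, a striped mirror out of four disks is built like this (pool and device names are placeholders; use /dev/disk/by-id paths in practice):

Code:
# Two mirror vdevs striped together (RAID10-like); ashift=12 assumes 4K-sector disks
zpool create -o ashift=12 tank mirror /dev/sda /dev/sdb mirror /dev/sdc /dev/sdd

# Verify the layout and health
zpool status tank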
 
We set up RAIDZ1 because I've never used it before and saw it was an option, so I thought I'd give it a try. When we get the new servers, based on what you've said and what I'm now reading, I will possibly go with HW RAID10 if I'm not using ZFS.
Also, 25G for the network is going to be an option by the time we buy the production servers, as the new core network should be live by then. The new storage we plan to buy has a 25G card available as well as Fibre Channel.
 
When we get the new servers, based on what you've said and what I'm now reading, I will possibly go with HW RAID10 if I'm not using ZFS.
That's surely an option; I personally would prefer to use ZFS for its features (like the bitrot protection and compression).
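For example (pool and dataset names are placeholders), compression is a per-dataset property, and regular scrubs are what detect and repair bitrot on redundant vdevs:

Code:
# Enable transparent compression on the VM dataset
zfs set compression=zstd tank/vmdata

# Start a scrub: checksums are verified and bad blocks repaired from the mirror copy
zpool scrub tank
zpool status tank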
 