What could be a good pattern for hardening Proxmox and setting up the network?

benoitc
I am looking for some guidance to make my Proxmox installation more "secure" and resilient.

  • I have a cluster of 3 machines, each with two 10G ports.
  • Storage is managed via a NAS connected through a 10 Gbit port (it also has two 1 Gbit ports)
  • All the machines are connected to a 10G switch
  • I have public IPv4s and a /56 IPv6 prefix to route to some of the VMs

What I plan to do is the following:
  1. Have the Proxmox interface in its own sub-network
  2. Allow clustering to work over the two interfaces. I plan to have 2 subnets for it
  3. Dedicate storage to one subnet and prioritise Corosync on the other
  4. Routing using SDN


For 1) I'm not sure how to do it. Should I create a dedicated bridge for it? Is this possible?
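Roughly what I have in mind, if a dedicated bridge is the right approach (interface names and addresses below are just placeholders):

Code:
# /etc/network/interfaces (sketch only, names and subnets are examples)
auto eno1
iface eno1 inet manual

# management bridge: Proxmox GUI/SSH only, own subnet
auto vmbr0
iface vmbr0 inet static
        address 10.0.10.11/24
        gateway 10.0.10.1
        bridge-ports eno1
        bridge-stp off
        bridge-fd 0

auto eno2
iface eno2 inet manual

# second bridge on the other 10G port for VM/storage traffic
auto vmbr1
iface vmbr1 inet static
        address 10.0.20.11/24
        bridge-ports eno2
        bridge-stp off
        bridge-fd 0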

For 2) and 3) I have found the following documentation: https://pve.proxmox.com/wiki/Cluster_Manager#pvecm_redundancy , is there anything more to do?
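If I read that page correctly, I would create the cluster with two Corosync links, something like this (cluster name and addresses are only examples):

Code:
# on the first node (sketch)
pvecm create mycluster --link0 10.0.30.11 --link1 10.0.40.11

# on the other nodes, joining via the first node
pvecm add 10.0.30.11 --link0 10.0.30.12 --link1 10.0.40.12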

For 4) are there any docs around that show how to pass an IPv6 prefix and an IPv4 block?
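Without SDN, the routed pattern I know of would look roughly like this (the prefixes below are documentation placeholders, not my real ones):

Code:
# /etc/sysctl.conf (sketch): enable forwarding on the host
net.ipv4.ip_forward = 1
net.ipv6.conf.all.forwarding = 1

# /etc/network/interfaces (sketch): routed bridge for the VMs
auto vmbr2
iface vmbr2 inet static
        address 203.0.113.1/24
        bridge-ports none
        bridge-stp off
        bridge-fd 0

iface vmbr2 inet6 static
        address 2001:db8:0:100::1/64
        # VMs take addresses from the routed /56 behind this bridge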
 
Hi,

just to be sure I understand you correctly.

The base network setup is
2 x 1 Gbit port
2 x 10 Gbit port
but you want to use only the two 10Gbit ports?
And these ports are connected to one 10Gbit switch?
 
Right, I was unclear. The setup is the following:

* each Proxmox node has 2 10Gbit ports,
* all Proxmox nodes are connected to a 10Gbit switch,
* the storage has one 10Gbit port connected to that switch, and two 1Gbit ports connected to another switch. That switch itself is connected to the 10Gbit switch using a 10Gbit link.

Hope it helps :)
 
In this case, your network is a single point of failure, and it isn't easy to compensate for this.

The problem that I see in this setup is:
when the storage network comes under pressure, the switch can easily overload and will not transport the other packets as fast as necessary.

Corosync is very sensitive to network latency.
Also, if you separate the networks with two NICs but use the same switch, the bottleneck will be the switch.

In other words, I would recommend a redundant network to make your setup "resilient".

If you are fine with this single point of failure and only have security concerns,
I would do the following.

Use a simple network setup:
one vmbr with an IP, and use the built-in firewall to block all unnecessary ports.[1]
Additionally, you can install fail2ban to improve security.[2]
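A minimal sketch for /etc/pve/firewall/cluster.fw could look like this (the management subnet is only an example; once the firewall is enabled, incoming traffic is dropped by default):

Code:
[OPTIONS]
enable: 1

[RULES]
IN ACCEPT -p tcp -dport 8006 -source 10.0.10.0/24 # web GUI only from management subnet
IN SSH(ACCEPT) -source 10.0.10.0/24 # SSH only from management subnet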


1.) https://pve.proxmox.com/wiki/Firewall
2.) https://pve.proxmox.com/wiki/Fail2ban
 
By redundant network you mean a second switch? I guess I could do that. I was also thinking of using Ceph for most VMs instead of relying only on the NAS.
 
By redundant network you mean a second switch?
Yes, e.g. a 1 Gbit fallback. Is it possible to add another 10Gbit port to the NAS?

If you like to go the Ceph way, I recommend another two-port NIC with at least 10Gbit, or better 25Gbit.
25Gbit is not more expensive than 10Gbit; the cost comes with the switch.
But for 3 nodes, you need no extra switch.

But for Ceph, you also need good disks. What do you have at the moment?
Also, keep in mind Ceph needs memory and CPU.
 
Unfortunately the NAS has only one 10Gbit port ... I can do the fallback on 1G, and I will buy another 10G switch to ensure redundancy. It's a little too late to add 25Gbit; I will think about it for the next upgrade :)

For the disks, they are reasonably good: each node has 2 Samsung SSD PM883 960 GB connected to it. The system is on another M.2 NVMe SSD. I guess that's OK for Ceph? Also each node has 64 GB DDR4 ECC RAM and a Xeon D-2141I.

Anyway, thanks a lot for the useful info!
 
I guess this will work, but you will not be satisfied.
You would need more memory for Ceph, and the separate network is a strict requirement.
This is a Ceph benchmark with this Samsung disk.[1]
The CPU will be strong enough.

The problem with this network still exists if you have no second port for the NAS.
The network is then not the single point of failure, but the NAS still is.

The mainboard that carries the Intel Xeon D-2141I normally has one PCIe slot free.
Does your server not have the capacity to use this slot?

1.) https://www.proxmox.com/en/downloads/item/proxmox-ve-ceph-benchmark
 
Unfortunately the second PCIe slot is used by the NVMe disks... maybe I should have taken 2 other disks instead :/ Right now I have 2 10G ports, so I can dedicate one to storage. But now I'm unsure what the best setup would be between the NAS and Ceph... I guess I will have to benchmark.
 
So I guess it won't be as fast as it could be, but I came to the design below with the current hardware. Any last feedback/hints are appreciated, as I am still unsure what performance I should expect from such a config. This afternoon promises to be interesting :)

Hardware: each node has the following properties:

- CPU Intel Xeon D-2141I 12C/24T
- 64 GB DDR4 2666 MHz ECC LR
- system on 2 NVMe disks in mirror
- 2 x Samsung SSD PM883 480 GB
- 2 x 10G ports

Deployment:

- Ceph will be installed and grow over the separate SSD disks on each node
- Ceph will use a separate subnet handled by one port on each node. These ports will be connected to a dedicated 10G switch A (rough sketch of the Ceph networks below this list)
- sync and other cluster- and internet-related traffic will go over the other 10G port, connected to a dedicated 10G switch B
- the NAS 10G link will also be on switch B. This will be used for stuff that can survive a crash.
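For the Ceph part I would then point both Ceph networks at the dedicated subnet, roughly like this (the subnet is only a placeholder; Proxmox keeps the config at /etc/pve/ceph.conf):

Code:
[global]
    public_network  = 10.0.50.0/24  # the subnet on switch A, dedicated to Ceph
    cluster_network = 10.0.50.0/24  # same subnet here; could be split further with more ports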

Does it make sense? I am still puzzled whether I should use Ceph with such a configuration.

I can see possible optimisations/changes in hardware, but I am unsure:

* increase RAM to 128 GB?
* remove the dedicated system RAID on NVMe and replace it with a SATA SSD DOM? Then add a 25Gbit card or another dedicated 10Gbit card? Still use Ceph
* remove the dedicated system RAID, add another 10Gbit card and use the NAS with 2 10G ports
* use a NAS with 2 10G ports and no Ceph. Maybe adding just another port would allow using both?


Thoughts?
 
