Proxmox VE CEPH cluster build

daniel.h · May 18, 2020

Hi all!

We are in the planning phase for a Proxmox-VE-cluster to expand our current IT-services at our institute. The main goal for now is to provide a basic programming environment and a web-based User-Interface for up to 100 concurrent users (teachers, students, institute staff) at times. User software is subject to discussion for now, but tests are going on with jupyterlab and sandstorm on ubuntu-20.04-LTS-server VMs, which could make a very nice combination with user-platform independence.

The hardware setup currently considered is the following (not yet decided specs given as spans):

+ COMPUTING
++ 2 'application nodes' (1 active/ 1 backup), 64-128 threads, 512-1TB RAM, SAS-SSD-RAID1 256GB for proxmox, 6 NICs (4x 1Gb, 2x 10Gb RJ45), LOCAL RAID for images (?)
++ 7 'storage nodes' CEPH, 16 threads, 16GB RAM, SAS-SSD-RAID1 256GB for proxmox, 4x2TB SATA-SSDs for the CEPH-OSDs, 6 NICs (4x 1Gb, 2x 10Gb RJ45)
+ NETWORKING
++ 1Gb Switch 20x RJ45 (Proxmox Control Network webinterface/ssh)
++ 1Gb Switch 20x RJ45 (Proxmox corosync 1)
++ 1Gb Switch 20x RJ45 (Proxmox corosync 2)
++ 10Gb Switch 20x RJ45 (Proxmox CEPH public network)
++ 10Gb Switch 20x RJ45 (Proxmox CEPH cluster network)

The model we found here (https://pve.proxmox.com/wiki/Deploy_Hyper-Converged_Ceph_Cluster) is highly attractive to us, but we (for now) do not have the hardware to test the concept, that's why we want to discuss the concept here in the Proxmox community, and see what you people think!
That said, we already run another system with 6 CEPH nodes with a safe storage of 100TB on bare metal nodes. The above will yield much lower (16TB at a replication of 3), but should be enough for our use case (it's calculated to provide 100GB/user max, plus 6TB for images, isos and snapshots (possibly split up into CEPH pools).

The above link describes the construction to be capable of running the VM-images straight from CEPH-storage, what is your experience with that? From an admin's point of view this is very tempting for VM-migration and storage overview -- what about performance compared to placing the VMs onto, say, a separate (local) SSD-RAID (SATA or SAS, Nvme is out of financial reach [that's the use of 'unclustered' Proxmox-VEs we have most experience with and is very good])?

The other big question is VM-distribution/count. We currently look at two models - two separate big VMs (like 20 threads, 256GB RAM for jupyter and sandstorm each), a couple of small ones (1 thread, 512M RAM) for administrative services like LDAP. This could be the easiest way to maintain the system but brings some 'security' issues as we will at least partly have to expose a shell to offer a 'real experience'. The other way would be to provide a small (like 1 threads, 8GB RAM) VM to each user, which eliminates the one-person-can-freeze-the-system part, but creates quite some extra-work in the areas DNS- and authentication-management (also, more than 2 application nodes would then be necessary). What do you think?

I'm looking forward to your input/thoughts/criticism!
Best regards,
Daniel

Alwin · May 18, 2020

daniel.h said:
The model we found here (https://pve.proxmox.com/wiki/Deploy_Hyper-Converged_Ceph_Cluster) is highly attractive to us, but we (for now) do not have the hardware to test the concept, that's why we want to discuss the concept here in the Proxmox community, and see what you people think!

Proxmox VE can also run nested and would give you the ability to dry-run your setup with virutal Proxmox VE nodes.

daniel.h said:
++ 7 'storage nodes' CEPH, 16 threads, 16GB RAM, SAS-SSD-RAID1 256GB for proxmox, 4x2TB SATA-SSDs for the CEPH-OSDs, 6 NICs (4x 1Gb, 2x 10Gb RJ45)

That is not enough memory, since OSDs will use 3-5 GB each. See our Ceph precondition [0] section in our documentation to get a feel for the resource consumption of Ceph.

daniel.h said:
+ NETWORKING
++ 1Gb Switch 20x RJ45 (Proxmox Control Network webinterface/ssh)
++ 1Gb Switch 20x RJ45 (Proxmox corosync 1)
++ 1Gb Switch 20x RJ45 (Proxmox corosync 2)
++ 10Gb Switch 20x RJ45 (Proxmox CEPH public network)
++ 10Gb Switch 20x RJ45 (Proxmox CEPH cluster network)

Additional networks you can configure is a network for migration and for backups. If the Proxmox VE nodes should host client and Ceph services, then I don't recommend to split the network [1]. Since only the replication traffic of the OSDs is running on the cluster network. Possibly a faster network may be needed for Ceph, see our benchmark paper [2].

daniel.h said:
The above link describes the construction to be capable of running the VM-images straight from CEPH-storage, what is your experience with that? From an admin's point of view this is very tempting for VM-migration and storage overview -- what about performance compared to placing the VMs onto, say, a separate (local) SSD-RAID (SATA or SAS, Nvme is out of financial reach [that's the use of 'unclustered' Proxmox-VEs we have most experience with and is very good])?

This depends also on the needed IO throughput. Storage that involves the network (aka. Ceph) will perform differently then eg. a local connected NVMe SSD. While local storage may be able to give you higher IO/s, it will also introduce a single point of failure. When the host is down, the services can't be recovered to other nodes.

To get some numbers on Ceph's performance [2], see our benchmark paper and the corresponding thread [3].

daniel.h said:
The other big question is VM-distribution/count. We currently look at two models - two separate big VMs (like 20 threads, 256GB RAM for jupyter and sandstorm each), a couple of small ones (1 thread, 512M RAM) for administrative services like LDAP. This could be the easiest way to maintain the system but brings some 'security' issues as we will at least partly have to expose a shell to offer a 'real experience'. The other way would be to provide a small (like 1 threads, 8GB RAM) VM to each user, which eliminates the one-person-can-freeze-the-system part, but creates quite some extra-work in the areas DNS- and authentication-management (also, more than 2 application nodes would then be necessary). What do you think?

DNS and authentication are two services that can be automated (eg. automatic register in DNS, LDAP connection with security lists).

But of course both approaches have some dis-/advantages. First off, you will need to think about the failure-domains of your setup. Do all VMs need to be under HA (high-availability)? Will the Ceph nodes ever host a VM/CT? Maybe create two instead of one cluster?

To give the above more context, if you choose to use HA. Proxmox VE uses corosync [4] to establish quorum throughout the cluster. On top of corosync runs the pmxcfs (/etc/pve/), every update (eg. VM config) needs to be transported by corosync to all cluster participants. And again a level higher sits the HA-stack (CRM & LRM) [5]. The HA-stack uses the /etc/pve/ha/ for its state and locks.

I hope this helps in your decision-making phase.

[0] https://pve.proxmox.com/pve-docs/chapter-pveceph.html#_precondition
[1] https://docs.ceph.com/docs/nautilus/rados/configuration/network-config-ref/
[2] https://proxmox.com/en/downloads/item/proxmox-ve-ceph-benchmark
[3] https://forum.proxmox.com/threads/proxmox-ve-ceph-benchmark-2018-02.41761/
[4] https://pve.proxmox.com/pve-docs/chapter-pvecm.html
[5] https://pve.proxmox.com/pve-docs/chapter-ha-manager.html

daniel.h · May 19, 2020

@Alwin: thanks for your input!

Alwin said:
Proxmox VE can also run nested and would give you the ability to dry-run your setup with virutal Proxmox VE nodes.

That is an interesting test, that will let us simulate authentication scenarios. We still have an older machine, that should be capable of keeping up to five of the mentioned small VMs (so we likely don't have to go the nested way). Pulling up CEPH and testing images will be pretty limitted though, as we only have 1Gb switches by hand.

Alwin said:
[... 16GB RAM at 4x2TB OSDs]
That is not enough memory, since OSDs will use 3-5 GB each. See our Ceph precondition [0] section in our documentation to get a feel for the resource consumption of Ceph.

Thanks for pointing out -- my calculation was based on the old filestore OSD-recommendations and not the new bluestore. I guess going up to 64GB RAM should be plenty for the storage nodes?

Alwin said:
Additional networks you can configure is a network for migration and for backups. If the Proxmox VE nodes should host client and Ceph services, then I don't recommend to split the network [1]. Since only the replication traffic of the OSDs is running on the cluster network. Possibly a faster network may be needed for Ceph, see our benchmark paper [2].

I'm not sure if we can afford additional networks -- but for VM usage accross the nodes: for now it's planned to keep the (conceptional) separation between 'application-' and 'storage-' nodes, so we're not planning to put VMs onto the storage-nodes. The thought is to have two 'types' of devices, to expand the cluster if needed. The expansion part is more in theory now, as we will be spending most of our budget in the initial phase of a five-year project running time.

Alwin said:
This depends also on the needed IO throughput. Storage that involves the network (aka. Ceph) will perform differently then eg. a local connected NVMe SSD. While local storage may be able to give you higher IO/s, it will also introduce a single point of failure. When the host is down, the services can't be recovered to other nodes.

Do you have any hints for estimating the 'needed IO throughput'? At the moment I'm doing tests on a VPS running sandstorm and jitsi -- the average IO/s on the VMs disk over the last month is 0.3 IO/s read and 3.28 IO/s write (with max values of 394 IO/s read and 655 IO/s write -- which likely was during upgrading the VMs OS). Is it a valid assumption, that under these cirumstances our CEPH would have to deliver 30 IO/s read and 328 IO/s write to provide a 'similar' performance for 100 'readily prepared' VMs?

Thinking about getting local storage on the application nodes, not as fully planned into the design, but as a fallback option if performance is not sufficient.

Alwin said:
[faster networks]
To get some numbers on Ceph's performance [2], see our benchmark paper and the corresponding thread [3].

We'd certainly like to go up to 25,40 or 100 Gb networking, but this raises the price bar more than just significantly and is most likely out of scope.

Alwin said:
DNS and authentication are two services that can be automated (eg. automatic register in DNS, LDAP connection with security lists). But of course both approaches have some dis-/advantages. First off, you will need to think about the failure-domains of your setup. Do all VMs need to be under HA (high-availability)? Will the Ceph nodes ever host a VM/CT? Maybe create two instead of one cluster?

Basically the VMs should all be under HA (be it the solution of two big VMs and some small administrative ones, or the granular solution with only small vms) - as this is the reason why we want to introduce CEPH

Best regards,
Daniel

Alwin · May 19, 2020

daniel.h said:
That is an interesting test, that will let us simulate authentication scenarios. We still have an older machine, that should be capable of keeping up to five of the mentioned small VMs (so we likely don't have to go the nested way). Pulling up CEPH and testing images will be pretty limitted though, as we only have 1Gb switches by hand.

Well, you can also run Ceph on linux bridges. The virtual test cluster doesn't need any physical ports, since you can generate your network with interface provided by linux (eg. dummy interface).

daniel.h said:
Thanks for pointing out -- my calculation was based on the old filestore OSD-recommendations and not the new bluestore. I guess going up to 64GB RAM should be plenty for the storage nodes?

Sizing is similar, just that filestore used the page cache (XFS) and bluestore has its own. Depends what else is running on those nodes, but 64 GB should give enough room for 4x 2TB SSDs and possible other Ceph services.

daniel.h said:
Basically the VMs should all be under HA (be it the solution of two big VMs and some small administrative ones, or the granular solution with only small vms) - as this is the reason why we want to introduce CEPH

Either way, I recommend to re-think the one big cluster vs two separate cluster approach. HA has some additional complexity and it will reset nodes, when quorum is lost.

daniel.h said:
We'd certainly like to go up to 25,40 or 100 Gb networking, but this raises the price bar more than just significantly and is most likely out of scope.

Skip 40 GbE, its 4x 10 GbE. 25 GbE will have better latency then 10 GbE and 100 GbE is 4x 25 GbE.

daniel.h said:
I'm not sure if we can afford additional networks -- but for VM usage accross the nodes: for now it's planned to keep the (conceptional) separation between 'application-' and 'storage-' nodes, so we're not planning to put VMs onto the storage-nodes. The thought is to have two 'types' of devices, to expand the cluster if needed. The expansion part is more in theory now, as we will be spending most of our budget in the initial phase of a five-year project running time.

Not only Ceph clients will need to access the public network. A great deal of performance gained/lost on the network.

daniel.h said:
Do you have any hints for estimating the 'needed IO throughput'? At the moment I'm doing tests on a VPS running sandstorm and jitsi -- the average IO/s on the VMs disk over the last month is 0.3 IO/s read and 3.28 IO/s write (with max values of 394 IO/s read and 655 IO/s write -- which likely was during upgrading the VMs OS). Is it a valid assumption, that under these cirumstances our CEPH would have to deliver 30 IO/s read and 328 IO/s write to provide a 'similar' performance for 100 'readily prepared' VMs?

While you can run a baseline tests to estimate the possible IO capacity of your current setup. The best measures will come from a long term performance monitoring. With that said, if you can estimate the max IO/s and bandwidth possible on your current setup, then you can size up the new cluster to similar performance.

daniel.h · May 20, 2020

@Alwin: thanks for your suggestions!

I changed the hardware specs in a try to integrate your thoughts. Please let me know, if I grasped it right!

The 'proxmox ceph cluster' and the 'proxmox application cluster' are now thought as separately networked entities, that talk to each other over SFP28/25Gb (possibly bonded).

Storage cluster node (7x):
+ single-socket CPU with at least 16 threads
+ at least 64GB RAM
+ 500GB SAS SSD RAID 1 (proxmox)
+ 4x 2TB SATA SSDs (OSDs)
+ Networking Components:
+ 3x 1Gb ports (proxmox control, corosync 1 & 2)
+ 4x SFP28/25Gb ports (ceph cluster & ceph public, possibly each bonded)

Application cluster node (2x):
+ 128 threads
+ 1 TB RAM
+ 500GB SAS SSD RAID 1 (proxmox)
+ Networking Components:
+ 4x 1Gb ports (proxmox control, corosync 1 & 2, UPSTREAM (+1?))
+ 2x SFP28/25Gb (ceph public, possibly bonded)

Network switches:
+ 1x SFP28/25Gb min 18 ports (ceph public, storage and application nodes meet here, filled up with this setup if using 2x 25Gb bonds)
+ 1x SFP28/25Gb min 18 ports (ceph cluster, storage nodes only, 4 ports left if using 2x 25Gb bonding)
+ 6x RJ-45/1Gb min 20 ports (3x for storage cluster, 3x for application cluster -- proxmox control, corosync 1 & 2 each -- port amount is overkill here, but from this size redundant power supplies seem likely to appear, while the switches still are affordable)

The one thing that bothers me now, is the obvious lack of expandability network-wise. If we want to add application nodes (to give the client's VMs eg 2 cores instead of one, we will have to go to the next level of 25G switches (if we'd want to go to 4x25Gb, we'd already need them now). For sure that's only valid if we bond the 25G NICs (otherwise we have plenty spare ports on the switches). Admittedly that problem would be similar in the initial 10G setup, but in the 25G class planning issues hurt the bill far more brutal.

Regarding the upstream connection for the VMs (within the VPN of our university) -- do you think it's technically useful bonding the 1Gb upstream connections through which all the VMs will talk to their clients (that's the ~UPSTREAM (+1?))~ note in specs above?

Best,
Daniel

Alwin · May 20, 2020

daniel.h said:
Application cluster node (2x):

You will need three for quorum.

daniel.h said:
The one thing that bothers me now, is the obvious lack of expandability network-wise. If we want to add application nodes (to give the client's VMs eg 2 cores instead of one, we will have to go to the next level of 25G switches (if we'd want to go to 4x25Gb, we'd already need them now). For sure that's only valid if we bond the 25G NICs (otherwise we have plenty spare ports on the switches). Admittedly that problem would be similar in the initial 10G setup, but in the 25G class planning issues hurt the bill far more brutal.

One option is to use 2x 25 GbE for the Ceph nodes in the beginning. On a switch failure 25 GbE alone should be enough to keep the service available till the broken switch can be replaced. Depending on the throughput / IOps those SATA SSDs can provide and also on the IOps of the workload, a bonded 25 GbE might already be sufficient.

daniel.h said:
+ 1x SFP28/25Gb min 18 ports (ceph public, storage and application nodes meet here, filled up with this setup if using 2x 25Gb bonds)
+ 1x SFP28/25Gb min 18 ports (ceph cluster, storage nodes only, 4 ports left if using 2x 25Gb bonding)

I suppose Ceph public & cluster networks are kept on the same link, since you want to sustain a switch failure. You could go with a combo switch that has 25 & 100 GbE ports or with 100 GbE solely. The 100 GbE ports can be split into 4x 25GbE as well.

Something along those lines.
https://store.mellanox.com/categories/switches/ethernet-switches/sn2000.html

daniel.h said:
The one thing that bothers me now, is the obvious lack of expandability network-wise. If we want to add application nodes (to give the client's VMs eg 2 cores instead of one, we will have to go to the next level of 25G switches (if we'd want to go to 4x25Gb, we'd already need them now).

Well, you will need three nodes anyhow. It also sounds like those nodes may host to many services, if 2 cores instead of 1 for VMs will result in additional nodes. If that is in some way the case, maybe you might consider the approach of more nodes with less resources per node. The load could be distributed more evenly and the resource (eg. bandwidth) requirement of one node will be lower. Also a node failure will result in less services being down and less load that needs to be distributed. The redundancy could go into the numbers of nodes (more the merrier).

daniel.h said:
Regarding the upstream connection for the VMs (within the VPN of our university) -- do you think it's technically useful bonding the 1Gb upstream connections through which all the VMs will talk to their clients (that's the ~UPSTREAM (+1?))~ note in specs above?

Depends if 2x1 GbE is enough bandwidth but LACP bonds should saturate the links evenly. My gut feeling would say, 2x10 GbE might be better, considering the amount of services that you seem to plan for on those nodes.

daniel.h · Jun 2, 2020

@Alwin: Thanks for your input and please excuse my late answer -- we had a lot (re)design discussions last week! We will be stepping back regarding the shared storage and the amount of services on the application nodes.

This means, that we will just be using the application nodes virtualized (with proxmox running on them, but 'unclustered' proxmox-wise). As we do have some extra storage devices at our facilitity we will not go the CEPH way and use a dedicated RAID10-NFS-server (SATA-SSDs) for shared (home-)directories across services instead. This one will be backed up onto a machine with the same setup (but with HDDs instead of SSDs) for redundancy. The NFS wil be connected to proxmox and the VMs via 2x10Gb (we decided to run the VM images not from the shared storage but rather from a local RAID1 on the application nodes).

So basically the trend is going from a hyperconverged-CEPH-setup towards a 'virtualization hybrid'-setup. Functionally speaking the proxmox-base on our application nodes allows us to remotely set up VMs, and the NFS allows us to keep the VM-images small, as user-data is mounted from there. Redundancy for the VMs is (manually) achieved through the second application node. The VM-count was lowered by far, as we will go towards partly rather big (eg. jupyterhub) VMs and a couple of smaller ones (apache, LDAP, mumble...).

As it would be interesting to use the NFS as space for isos and VM-snapshots -- this would likely mean, that we will have to mount the NFS into proxmox AND the VMs/services which require shared storage?

Best regards,
Daniel

Alwin · Jun 2, 2020

daniel.h said:
This means, that we will just be using the application nodes virtualized (with proxmox running on them, but 'unclustered' proxmox-wise).

I am not sure here. But are you aware that the Proxmox VE cluster has nothing to do with the Ceph cluster? I mean in terms of clustering.

daniel.h said:
As it would be interesting to use the NFS as space for isos and VM-snapshots -- this would likely mean, that we will have to mount the NFS into proxmox AND the VMs/services which require shared storage?

Yes, it will need to be mounted as storage. And the VM disk(s) will need to be located on the NFS storage. As snapshots are located on same storage as the disk.

daniel.h · Jun 2, 2020

Alwin said:
I am not sure here. But are you aware that the Proxmox VE cluster has nothing to do with the Ceph cluster? I mean in terms of clustering.

Yes, that's clear. I'm talking about using one proxmox-node as a means to provide our VMs, and the second one as a fallback device, if hardware fails badly on the first one. Redundancy would be given by having the actual VM-disks on both separately. Having two separate web-interfaces to control the VMs is OK to us.

In our actual setup, where we use several proxmox VE CE servers, we push full backups (no snapshots) of the VMs onto an NFS-server (not through a dedicated switch, like we are planning now). The VMs themselves are located on the server itself. This works all nicely, so we will stick with that kind of setup until we have a reasonable amount of machines, that will let us allow to go for the initial plan of full virtualization.

To explain how CEPH came onto the plan (and is fading for now): parallel to the mentioned machines we run an HPC setup, where a 6-node-CEPH-cluster provides shared storage for user homes and most programs to 6 computing nodes. It is fast and reliable, but on bare metal it is not nice to administer (that's were our tests with ceph on proxmox came into play). As storage demands for our new project are vastly lower (primary goal is to provide a programming/teaching platform for python and R, people are not supposed to use the setup as 'storage device'), we stepped away from CEPH. Also to just reduce the count of necessary machines, as we are on a budget.

Alwin said:
Yes, it will need to be mounted as storage. And the VM disk(s) will need to be located on the NFS storage. As snapshots are located on same storage as the disk.

That is, if we do not run the VM disks from the NFS, but from a local device on the server, we could place snapshots only onto the local storage, and pull full backups onto NFS (as we are doing now)? Do you know how to calculate snapshot space consumption (eg. a VM with a 30GB disk image, of which 20GB are used from within the VM)?

Assuming I have three shares on the nfs-server (say, /srv/isos /srv/users /srv/vm_backups) that are partly meant for proxmox (/srv/isos /srv/vm_backups) and partly for VMs/services that require user homes (/srv/users). Am I right that we will have to reserve 2 (possibly bonded) 10Gb connections to the switch that connects to the NFS-server (1 for proxmox, 1 for the VM(s) that shall have NFS-mounted user homes)?

Alwin · Jun 2, 2020

Thanks for sharing your thought process and current setup.

daniel.h said:
That is, if we do not run the VM disks from the NFS, but from a local device on the server, we could place snapshots only onto the local storage, and pull full backups onto NFS (as we are doing now)?

The snapshots are placed on the same storage (eg. LVM-thin, ZFS, ...). And yes, vzdump creates full backups.

daniel.h said:
Do you know how to calculate snapshot space consumption (eg. a VM with a 30GB disk image, of which 20GB are used from within the VM)?

I am not sure what you are asking me. If you mean how big can a snapshot grow, then it is always as big as the disk size of the VM.

daniel.h said:
Am I right that we will have to reserve 2 (possibly bonded) 10Gb connections to the switch that connects to the NFS-server (1 for proxmox, 1 for the VM(s) that shall have NFS-mounted user homes)?

The configuration will need a bridge in any case. If the bridge has an IP, then the node and the VMs can communicate with the NFS server.

daniel.h · Jun 2, 2020

Alwin said:
I am not sure what you are asking me. If you mean how big can a snapshot grow, then it is always as big as the disk size of the VM.

Thanks, you guessed my question right! It's important for me to estimate necessary local storage for the VMs & snapshots on the application-nodes (ie proxmox-servers).

Alwin said:
The configuration will need a bridge in any case. If the bridge has an IP, then the node and the VMs can communicate with the NFS server.

Currently our configuration uses separate IPs for all entities -- 1 for the node and 1 for each VM. Partly we have these VMs on the same (bridged) NIC as the proxmox-server itself (one bridge), partly we use two (bridged) NICs (one just for accessing proxmox internally, the other connected to DMZ for external access, just for the VM that needs this connection) (two bridges).

What we were thinking about in our new project was to physically separate proxmox-NFS-traffic and VM-NFS-traffic on server-NIC-level, but the more I think about it, the more it looks artificial. I guess a setup with 2 bridges on 2x10Gb NIC-ports equals 1 bridge on 4x10Gb-ports regarding speed and redundancy for the nfs-server-access (all physical 10Gb connections going into the same switch)? The latter would at least eliminate the need for the configuration of two bridges, that look into the same place anyway

As we have the same story 'upstream' (our connection to the institute's network), this could be a nice reduction of necessary configurations.

Alwin · Jun 2, 2020

KISS is always a good principle. But VLAN may be a good option to separate the traffic logically. This may allow to enforce access restrictions if needed.

PigLover · Jun 2, 2020

daniel.h said:
...We'd certainly like to go up to 25,40 or 100 Gb networking, but this raises the price bar more than just significantly and is most likely out of scope...

You should double check that assumption. Assuming you are buying new, current prices for 25GB NICs/Switches are only a small margin higher than 10gb. Of course this changes if you have significant installed base to leverage (existing switches or 10GB NIC inventory). But check current pricing - you might be surprised to find that 25Gbe may be in your budget.

daniel.h · Jun 3, 2020

PigLover said:
You should double check that assumption. Assuming you are buying new, current prices for 25GB NICs/Switches are only a small margin higher than 10gb. Of course this changes if you have significant installed base to leverage (existing switches or 10GB NIC inventory). But check current pricing - you might be surprised to find that 25Gbe may be in your budget.

Thanks for your hint. I've been checking prices -- you are right, that some of the 25Gb switches kind of would be in our range. The problem is (as we are spending public money) -- the affordable ones (at least here in Europe) only have a year warranty (Cisco and Mellanox!). The 10Gb ones go for about half/two thirds of the price of the cheaper 25Gb ones, but usually are covered by 10years. I just can't let this essential part of our setup let without warranty after the first year of a five-year project (that hopefully will keep developing after this span)... though it would definitely be tempting!

daniel.h · Jun 3, 2020

Alwin said:
KISS is always a good principle. But VLAN may be a good option to separate the traffic logically. This may allow to enforce access restrictions if needed.

All our planned switches do have full management and VLAN capabilities -- so we will be able to adapt this approach, as having the proxmox-control-interface in the same network as the services themselves gives me a mediocre gut feeling every now and then. Which raises a new question -- is there a KISS-version of integrating the proxmox-webinterface with fail2ban eg?

Alwin · Jun 3, 2020

daniel.h said:
Which raises a new question -- is there a KISS-version of integrating the proxmox-webinterface with fail2ban eg?

https://pve.proxmox.com/wiki/Fail2ban

daniel.h · Jun 3, 2020

Alwin said:
https://pve.proxmox.com/wiki/Fail2ban

Thanks, that's exactly what I was hoping for

Search

Search

Proxmox VE CEPH cluster build

daniel.h

Member

Alwin

Proxmox Retired Staff

daniel.h

Member

Alwin

Proxmox Retired Staff

daniel.h

Member

Alwin

Proxmox Retired Staff

daniel.h

Member

Alwin

Proxmox Retired Staff

daniel.h

Member

Alwin

Proxmox Retired Staff

daniel.h

Member

Alwin

Proxmox Retired Staff

PigLover

Renowned Member

daniel.h

Member

daniel.h

Member

Alwin

Proxmox Retired Staff

daniel.h

Member

We value your privacy