First time PVE setup

coreyclamp

New Member
Jul 2, 2020
I've gotten a lot of mileage out of my lab/home compute environment, but it's running on 10th-gen Dell hardware and the time has finally come to put it out to pasture. I caught a great deal on some used Dell 12th-gen servers, so I decided to refresh everything and move from vSphere to Proxmox in the process. After a few weeks of setting up and reinstalling several times, I think I have it mostly figured out. I still have some lingering questions, though, and would like to know what best practice would be for this setup before I start migrating VMs over from vSphere.

Server Hardware:
  • 3 x PowerEdge r620 (Proxmox nodes)
    • 2 x 146 GB SAS (proxmox)
    • 6 x 600 GB SAS (ceph OSD)
    • 4 x 1GbE
    • 2 x 10GbE
  • 1 x PowerEdge r720xd
    • FreeNAS - serves NFS, SMB, & iSCSI to both hosts and guests; primarily used for IP cam footage, media streaming, & backups
    • 4 x 1GbE
    • 4 x 10GbE
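
On the Proxmox side I plan to consume the FreeNAS exports as storage entries along these lines (the storage IDs, the addresses on vlans 21/22, the export path, and the target IQN below are placeholders, not my actual values):

Code:
    nfs: freenas-nfs
        path /mnt/pve/freenas-nfs
        server 10.0.21.5
        export /mnt/tank/pve
        content images,backup,iso
        options vers=4.1

    iscsi: freenas-iscsi
        portal 10.0.22.5
        target iqn.2005-10.org.freenas.ctl:pve
        content images
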
Network Config:
  • VLANs/Subnets:
    • vlan 10 - Proxmox management traffic (default gateway)
    • vlan 11 - Proxmox/corosync cluster traffic (non-routed subnet)
    • vlan 12 - ceph public (non-routed subnet)
    • vlan 13 - ceph cluster (non-routed subnet)
    • vlan 21 - NFS/SMB services
    • vlan 22 - iSCSI services
    • vlan xx - various subnets VM guests, placement may be on either bond/bridge depending on requirements
  • OVS Bonds:
    • All LACP layer 2 & 3
    • bond0 - 4 x 1GbE
    • bond1 - 2 x 10GbE
  • OVS Bridges
    • vmbr0 - bond0
    • vmbr1 - bond1
  • OVS IntPorts
    • prox_mgmt (vlan 10, vmbr0)
    • prox_cluster (vlan 11, vmbr0)
    • ceph_public (vlan 12, vmbr0)
    • ceph_cluster (vlan 13, vmbr1)
    • stor_nfs (vlan 21, vmbr1)
    • stor_iscsi (vlan 22, vmbr1)
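
To make that concrete, the 1GbE side of one node's /etc/network/interfaces looks roughly like the sketch below (assuming ifupdown2 plus the openvswitch-switch package; the NIC names, addresses, and exact ovs_options are placeholders rather than my literal config):

Code:
    # sketch of the 1GbE side of one node; NIC names and addresses are placeholders
    auto bond0
    iface bond0 inet manual
        ovs_type OVSBond
        ovs_bridge vmbr0
        ovs_bonds eno1 eno2 eno3 eno4
    # bond_mode is where the hash policy is chosen (balance-slb vs balance-tcp)
        ovs_options bond_mode=balance-tcp lacp=active other_config:lacp-time=fast

    auto vmbr0
    iface vmbr0 inet manual
        ovs_type OVSBridge
        ovs_ports bond0 prox_mgmt prox_cluster ceph_public

    auto prox_mgmt
    iface prox_mgmt inet static
        address 10.0.10.11
        netmask 255.255.255.0
        gateway 10.0.10.1
        ovs_type OVSIntPort
        ovs_bridge vmbr0
        ovs_options tag=10

    # prox_cluster (tag=11) and ceph_public (tag=12) follow the same pattern without a gateway;
    # bond1/vmbr1 with ceph_cluster, stor_nfs, and stor_iscsi mirror this on the 2 x 10GbE ports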

All 1G connections go to a Cisco Catalyst; the 10G links go to a MikroTik CRS317. The two switches are connected to each other with a dual 10G link.

Ceph mon and metadata (MDS) daemons run on each node.
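
In /etc/pve/ceph.conf that split shows up as the public and cluster networks; roughly like this (the subnets are placeholders matching the VLAN numbers):

Code:
    [global]
        # vlan 12 = ceph_public (vmbr0), vlan 13 = ceph_cluster (vmbr1)
        public_network = 10.0.12.0/24
        cluster_network = 10.0.13.0/24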

Is there something else I should be doing with the network setup, specifically around Ceph? Will sharing the dual 10G bond with iSCSI/NFS and some VM traffic have much impact on OSD replication? I don't plan on really taxing this setup; the most I have is a home lab (AD domain, DB clusters, a few web/app servers, etc.). There is also a security-cam DVR server that currently runs 16 cameras, which I plan on doubling, but with H.265 each camera only produces about 0.75-1 GB/hr.
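
Back-of-envelope for the camera traffic, assuming the high end of 1 GB/hr for all 32 cameras:

Code:
    # 32 cameras * ~1 GB/hour each, expressed as sustained throughput
    echo "$((32 * 1000 / 3600)) MB/s total, about $((32 * 8000 / 3600)) Mbit/s"

So under 10 MB/s (roughly 70 Mbit/s) in aggregate, which is noise next to the 10G links.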


Is there anything I missed or should be doing differently?
 
That should work, but IMHO I suggest you consider:

1. Take one of the 1G links out of the bond and create a separate, dedicated network for PVE cluster traffic (Corosync/Kronosnet) [0]; a corosync.conf sketch follows at the end of this post.
2. Six 10K SAS drives x 3 nodes can easily saturate a 10G network during recovery (and probably even a plain rebalance). During these events you'll likely have issues with the iSCSI and NFS sessions as well, which creates even more traffic. Also, layer 2+3 hashing will likely not balance Ceph, NFS, or iSCSI well, since those sessions all hash on layer 3+4. As long as you are aware of it, you can deal with it. If you can't, break the 10G bond and run 1 x 10G for Ceph (public/cluster) and 1 x 10G for external storage (iSCSI, NFS, etc.)

[0] https://pve.proxmox.com/pve-docs/pve-admin-guide.html#_cluster_network
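
To illustrate point 1: with a second link defined, each node entry in /etc/pve/corosync.conf ends up with two rings, something like the sketch below (node names and addresses are examples only):

Code:
    # ring0 on the dedicated corosync subnet, ring1 on the management network as a fallback
    nodelist {
      node {
        name: pve1
        nodeid: 1
        quorum_votes: 1
        ring0_addr: 10.0.11.11
        ring1_addr: 10.0.10.11
      }
    }

The other node entries get the same pair of addresses, and remember to bump config_version in the totem section whenever you edit the file by hand.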
 
