Hi, I'm a Microsoft solutions engineer currently evaluating a migration path for several of our Windows Server clusters (based on Hyper-V and S2D) to Proxmox VE. While I'm experienced in the Microsoft ecosystem, Proxmox and Ceph are not my strong suit, so I'm seeking guidance and validation from those of you with deeper expertise in this area.
Hardware (4 nodes):
- Dell R750 servers
- 2 × Intel Xeon Gold 16c/32t CPUs per node
- 1024 GB RAM per node
- 16 × 7.68 TB NVMe SSDs (Dell P5500 RI), 100% SSD-based
- 4 × 25 GbE NICs per node (Mellanox ConnectX-5)
- Switching: 2 × Dell S5048F-ON
  - 100 GbE interconnect between switches
  - 25 GbE ports to nodes
  - VLT (Virtual Link Trunking) enabled
  - Cross-switch LACP bonding per node
Network & QoS:
- Bonding: 4 × 25 GbE on each node (bond0); see the bond/QoS config sketch after this list
- Mode: balance-xor with xmit_hash_policy=layer3+4
- MTU 8000 across hosts, switches, bridges
- VLAN-aware configuration
- QoS via tc + fq_codel, with class-based shaping:
  - Ceph RBD I/O: 50–60% (constant)
  - Ceph Replication: 10–20% (post-backup/snapshot)
  - VM LAN: 20–40% (application traffic)
  - VM Migration: 1–5% (planned)
  - Management: 1–2% (critical, low jitter)
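To make the network side concrete, here is a minimal sketch of the bond and VLAN-aware bridge as I would define them in /etc/network/interfaces; the NIC names are placeholders for our ConnectX-5 ports, not the final config:

```
# /etc/network/interfaces (sketch, ifupdown2 syntax; NIC names are placeholders)
auto bond0
iface bond0 inet manual
        bond-slaves enp65s0f0np0 enp65s0f1np1 enp66s0f0np0 enp66s0f1np1
        bond-mode balance-xor
        bond-xmit-hash-policy layer3+4
        bond-miimon 100
        mtu 8000

auto vmbr0
iface vmbr0 inet manual
        bridge-ports bond0
        bridge-stp off
        bridge-fd 0
        bridge-vlan-aware yes
        bridge-vids 2-4094
        mtu 8000
```

And a heavily simplified version of the tc shaping idea; the rates, class IDs and the single example filter are placeholders (the real setup has one class per traffic type listed above):

```
# Simplified HTB + fq_codel shaping on bond0 (rates/classes are placeholders)
tc qdisc add dev bond0 root handle 1: htb default 30
tc class add dev bond0 parent 1: classid 1:1 htb rate 100gbit
tc class add dev bond0 parent 1:1 classid 1:10 htb rate 55gbit ceil 100gbit   # Ceph RBD I/O
tc class add dev bond0 parent 1:1 classid 1:30 htb rate 30gbit ceil 100gbit   # VM LAN (default)
tc qdisc add dev bond0 parent 1:10 fq_codel
tc qdisc add dev bond0 parent 1:30 fq_codel
# steer Ceph msgr2 traffic (TCP 3300; OSDs use 6800-7300) into the Ceph class
tc filter add dev bond0 parent 1: protocol ip u32 match ip dport 3300 0xffff flowid 1:10
```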
Ceph Configuration:
- Version: Ceph Reef
- 16 OSDs per node = 64 total
- Pool: Single 3× replicated data pool (creation sketch after this list)
- PG count: 2048 (manually set, autoscaler off)
- CRUSH: simplified with a single rule
- CRUSH bucket algorithm: straw2
- RBD object size: 4 MB (considering 16 MB)
- MTU: 8000
- Ceph traffic over bond0, separated with QoS
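For reference, the pool setup boils down to roughly this with the plain ceph CLI (the pool name vm-pool is a placeholder, min_size 2 is the usual default for size 3, and I'm aware pveceph / the GUI can do the equivalent), plus how a larger object size could be set per image for testing:

```
# Sketch: 3x replicated RBD pool with a fixed PG count (pool name is a placeholder)
ceph osd pool create vm-pool 2048 2048 replicated
ceph osd pool set vm-pool size 3
ceph osd pool set vm-pool min_size 2
ceph osd pool set vm-pool pg_autoscale_mode off
ceph osd pool application enable vm-pool rbd

# For the 16 MB object-size question: object size can be set per image at creation, e.g.
rbd create vm-pool/testimage --size 100G --object-size 16M
```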
VM Configuration (Proxmox VE 8.4):
- VirtIO SCSI single (see the example qm command below)
- iothread=1, discard=on
- cache=none
- AIO: io_uring
- CPU type: host
- QEMU guest agent: installed on all VMs
- Guest OS: Windows Server (SQL, RDS, AD, File)
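To be explicit, the per-VM settings above translate to roughly this; the VMID, storage ID and disk name are placeholders:

```
# Sketch: disk, CPU and agent options for one Windows VM (IDs/names are placeholders)
qm set 101 --scsihw virtio-scsi-single \
    --scsi0 cephvm:vm-101-disk-0,iothread=1,discard=on,cache=none,aio=io_uring
qm set 101 --cpu host --agent enabled=1
```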
Backup:
- Handled via Veeam Backup & Replication (VBR)
- VMs use the QEMU guest agent and VSS for application-consistent snapshots
- No Proxmox Backup Server (PBS) is planned at this stage
Host & Cluster:
- CPU mitigations disabled (mitigations=off)
- I/O scheduler: none (for the NVMe devices)
- All traffic flows through bond0 with VLAN and QoS class-based separation
- External qdevice configured for quorum and split-brain protection in the 4-node corosync cluster (setup sketch below)
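The host/cluster pieces look roughly like this; the qdevice IP is a placeholder, and I'm aware multiqueue NVMe devices usually default to the none scheduler anyway:

```
# Sketch: kernel cmdline on GRUB-booted nodes (mitigations=off is a deliberate trade-off)
# /etc/default/grub:
#   GRUB_CMDLINE_LINUX_DEFAULT="quiet mitigations=off"
# then: update-grub && reboot

# Verify the NVMe I/O scheduler (expected: [none])
cat /sys/block/nvme0n1/queue/scheduler

# External quorum device: corosync-qnetd runs on a separate host,
# corosync-qdevice is installed on all four nodes, then from one node:
pvecm qdevice setup 10.0.0.50
```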
Questions:
- Is a single pool (64 OSDs, 2048 PGs) sufficient for mixed workloads, or should I split it into multiple pools?
- Would a 16 MB object size improve performance for large VMs (especially SQL/Windows)?
- How does Ceph Reef compare to S2D with RDMA (e.g., 4 SSDs per CSV) in terms of latency?
- Which Ceph or RBD parameters should I tune for fsync-heavy workloads (SQL, AD)?
- Are RBD snapshots with qemu-agent + VSS safe for SQL Server consistency?
- Do xmit_hash_policy=layer3+4 and MTU 8000 help with replica distribution and avoiding bottlenecks?
- And most importantly: will my Windows Server VMs perform well enough in this setup? I'm genuinely concerned about the performance of my workloads – especially domain controllers, file servers, RDS and SQL – under Ceph compared to what I'm used to on S2D.
Best regards,
P