Pre-install questions Proxmox/CEPH

NewDude
I ran a Proxmox cluster for a while a couple of years back and was reasonably happy running ZFS as local storage for VMs, but I wanted native support for Veeam, so I moved to Hyper-V. I'm not nearly as happy with Hyper-V, and I'm at the point where I'll either reinstall my VMs on bare hardware or move back to Proxmox. I like what I'm seeing with Ceph these days, so I thought I'd reach out and see what y'all think.

My overall goal is to minimize downtime for the primary server I host, which is an online forum (recently migrated from Hyper-V to bare metal). Right now we're running at about 75% of our normal load, and I'm seeing about 125 transactions per second on that server. Other VMs I'd like to run are one that just hosts a few low-demand WordPress sites, a Windows Server 2016 VM that really just handles authentication, and a Linux VM that monitors the other VMs. Basically, I don't need a ton of IOPS from the Ceph cluster; I'm really more interested in failover should hardware fail while I'm out of town.

My cluster right now is:
  • 3 Dell T30s with low-end Xeon processors, 64G RAM each, and two 10G network ports each
  • 1 Dell T30 with a non-Xeon processor, one 10G network port, and 64G RAM
  • Redundant 10G switches
  • A single gigabit switch
What I'm thinking of doing is building the T30s as Proxmox hosts, each with a ZFS mirror just for Proxmox (two hard drives only, no log device), plus a 2TB SSD in each that will later be used to create the Ceph OSDs/pool (sorry if I'm using the wrong terminology here). I've read that a 3-host cluster can leave Ceph struggling after one host fails, so I'm thinking about adding the 4th T30 to the cluster as well, also participating in the Ceph pool, so that if one host fails, performance will still be fine and I won't have an emergency while waiting to get the host back up.
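For reference, my understanding is the per-node build-out would look roughly like the sketch below. The device name (/dev/sdc), pool name, and Ceph network are all assumptions on my part, and older PVE 5.x releases spell the subcommands createmon/createosd/createpool instead:

    # Install Ceph packages and point Ceph at a dedicated network (assumed 10.10.10.0/24)
    pveceph install
    pveceph init --network 10.10.10.0/24

    # Create a monitor (on at least three of the four nodes)
    pveceph mon create

    # Turn the 2TB SSD into an OSD (assumes the SSD is /dev/sdc)
    pveceph osd create /dev/sdc

    # Finally, from any one node, create the pool the VM disks will live on
    pveceph pool create vm-pool --size 3 --min_size 2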

So, my questions:
  • I know my hardware is far from powerful, but I believe it's sufficient. Or at least, it's proven sufficient (overpowered, even) so far using Hyper-V with iSCSI mounts. Is there any reason to think the Ceph/Proxmox combo won't offer comparable performance? Like I said, demand on the storage subsystem is low most of the time: right now (again, at 75% of my normal user load) I'm seeing less than 30 IOPS on the iSCSI device (peaking at 62/s in the last 10 minutes) for everything other than my forum, which only adds about another 125 tps.
  • Are datacenter SSDs a necessity for my use case, or can I get by with something cheaper?
  • What's the advisability of just using the 10G network for all communications: Proxmox-to-Proxmox communication, storage, and a VLAN for the DMZ machines (see the sketch after this list)? Or does using the 10G network for storage alone make the most sense? (I ask because if I can do it all on the 10G network, I can essentially remove the gigabit switch as a single point of failure should I choose to add a second firewall later on, and firmware updates on the switches will be easier, since the active/passive 10G links should keep everything running through switch reboots...)
  • Is there going to be a problem mixing CPU types in the cluster? That non-Xeon box is there not as a migration target but as an additional Ceph host, and as a host to restart a VM on if one of the other machines fails. I don't see an issue, but there may be issues I'm not considering here.
  • Is a 2 hard drive ZFS mirror going to be fine for the Proxmox host machines themselves? I'd think this would be fine, but again - y'all are the experts here.
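To make the all-on-10G idea in the third question concrete, here's a sketch of what /etc/network/interfaces could look like with an active-backup bond across the two 10G ports. The interface names, addresses, and VLAN tag are all assumptions:

    # The two 10G ports, one cabled to each switch
    auto bond0
    iface bond0 inet manual
        bond-slaves enp5s0 enp6s0
        bond-mode active-backup     # survives a single switch reboot/failure
        bond-miimon 100

    # Management and cluster traffic on the untagged bond
    auto vmbr0
    iface vmbr0 inet static
        address 192.168.1.11/24
        gateway 192.168.1.1
        bridge-ports bond0
        bridge-stp off
        bridge-fd 0

    # DMZ VLAN (tag 100 assumed) carried over the same bond
    auto bond0.100
    iface bond0.100 inet manual

    auto vmbr1
    iface vmbr1 inet manual
        bridge-ports bond0.100
        bridge-stp off
        bridge-fd 0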
That's all I've got for now. Thanks. :)
 
1. For Ceph, SSDs are recommended, though you can still run on enterprise SAS disks; you'll need to tune Ceph parameters, or you'll keep getting slow-ops warnings (see the sketch below).
2. If you have 10G interfaces, use them for a dedicated Ceph network.
3. There won't be issues mixing CPUs as long as you use a KVM (x86) CPU type in the VM configuration.
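On point 1, the kind of tuning meant is throttling backfill and recovery so background I/O doesn't starve client I/O on spinning disks. A minimal sketch in /etc/pve/ceph.conf; the values are illustrative starting points, not tested recommendations:

    # /etc/pve/ceph.conf (illustrative values only)
    [osd]
        osd max backfills = 1          # at most one backfill per OSD at a time
        osd recovery max active = 1    # at most one active recovery op per OSD
        osd recovery op priority = 1   # favor client I/O over recovery I/O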
 
To get a feel for Ceph performance, check out our benchmark paper [0] and the corresponding forum thread [1]. Please also see the precondition section [2] in our docs.

But from what I am reading, I would rather recommend storage replication [3], with a separate database cluster, since that failover happens a level higher. That also minimizes the loss of transactions in the event of a hardware failure.

[0] https://proxmox.com/en/downloads/item/proxmox-ve-ceph-benchmark
[1] https://forum.proxmox.com/threads/proxmox-ve-ceph-benchmark-2018-02.41761/
[2] https://pve.proxmox.com/pve-docs/chapter-pveceph.html
[3] https://pve.proxmox.com/pve-docs/chapter-pvesr.html
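As a concrete example, a replication job is a one-liner; the VM ID (100), target node name (pve2), and 15-minute schedule below are assumptions:

    # Replicate VM 100's disks to node pve2 every 15 minutes (job ID 100-0)
    pvesr create-local-job 100-0 pve2 --schedule "*/15"

    # Inspect the state of all replication jobs on this node
    pvesr status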
 
But from what I am reading, I would rather recommend storage replication [3], with a separate database cluster, since that failover happens a level higher. That also minimizes the loss of transactions in the event of a hardware failure.

Thanks, Alwin. After reading the thread where people document their Ceph performance, I'm not sure I'd be happy with that solution. I can see backups really slowing things down if I were to go that route.

So, with storage replication: can it fail over automatically? Searching turns up threads where it looks like, after a PVE host fails, manual intervention is required to restart the affected VM on the node the data was replicated to. Is it possible to automate this, or is it really just a way to speed up recovery should a host fail?
 
So, with storage replication: can it fail over automatically? Searching turns up threads where it looks like, after a PVE host fails, manual intervention is required to restart the affected VM on the node the data was replicated to. Is it possible to automate this, or is it really just a way to speed up recovery should a host fail?
Yes, otherwise it's not HA. ;) See our docs for more.
https://pve.proxmox.com/pve-docs/chapter-pvesr.html
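For instance, combining replication with the HA manager might look like this (the VM ID is an assumption):

    # Put VM 100 under HA management; it will be started on another
    # node (using the last replicated disk state) if its host fails
    ha-manager add vm:100 --state started

    # Watch the HA stack's view of the cluster
    ha-manager status

Keep in mind that the target node only has the disks as of the last successful sync, so a failover can lose up to one replication interval of writes.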
 
Awesome. I don't see that stated in the docs, but now that you've made it clear I don't need to worry about that. :)

Part of the goal with my move is to do away with a shared storage backend that I no longer trust. I think this probably gets me there, and it will provide the peace of mind to let me do things like go on vacation, knowing that if a host dies the VM(s) will keep running. :)
 
Part of the goal with my move is to do away with a shared storage backend that I no longer trust. I think this probably gets me there, and it will provide the peace of mind to let me do things like go on vacation, knowing that if a host dies the VM(s) will keep running. :)
Just to make sure: the VM/CT will be started on the target node once the current host of the service fails.
 
