HA with Ceph and separate Ceph cluster

spdflyer

New Member
Apr 19, 2019
Hello,

I'm curious to get feedback on whether this setup makes sense or if I'm looking at things from the wrong angle.

I currently have a Ceph cluster set up with 4 servers, 11x 4TB HDDs, and 10G for the pub/priv networks.

I'm thinking of setting up 3* Proxmox servers, each with 1 HDD for the OS and 2 SSDs to act as OSDs, for a Proxmox-only HA Ceph cluster. The new servers would then need 3x 10G connections: 2 for the Proxmox Ceph pub/priv networks and 1 connecting to the current cluster. I hope that makes sense.
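
For the pub/priv split on the new cluster, I'm picturing something like this in each node's ceph.conf (the subnets here are just placeholders):

    [global]
        # front side: clients/VMs talking to mons and OSDs
        public_network = 10.10.10.0/24
        # back side: OSD replication and recovery
        cluster_network = 10.10.11.0/24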

If I understand correctly though, Proxmox creates its own virtual pub/priv networks, so I don't need to configure the VLANs on the actual switch?

Should I do it that way, or does it make better sense to just add 2 SSDs to each of my current cluster servers and set up separate pools: one for VMs on SSD and one for storage on HDD? Then I'd have a single 10G connection going from the Proxmox boxes.
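
If I went the two-pool route, I assume it would look roughly like this using device classes and CRUSH rules (the pool names and PG counts are just examples):

    # one CRUSH rule per device class
    ceph osd crush rule create-replicated replicated-ssd default host ssd
    ceph osd crush rule create-replicated replicated-hdd default host hdd
    # fast pool for VM disks, big pool for bulk storage
    ceph osd pool create vm-disks 128 128 replicated replicated-ssd
    ceph osd pool create bulk-storage 512 512 replicated replicated-hdd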

*Adding a 4th later for redundancy.
 
Hi Spdflyer,

It sounds like you have a good grip on how all of this works.

We run a 3-node Ceph / Proxmox cluster with everything on those three nodes ... Ceph storage, Proxmox, etc. This works well for us now and minimized our startup costs. We're considering adding SSDs for Windows Server VMs, but the many Debian VMs, CentOS VMs, and one pfSense BSD VM work just fine on standard HDDs.

But this is us ... our VMs hardly touch the disks once they boot. If you're expecting a lot of disk IO, have less powerful servers, etc., you'll want a different setup. If we had been able to, I would have set up one set of servers for Ceph and another for Proxmox HA. It sounds like that's what you're suggesting.

Be sure that you have primary and secondary corosync networks with nothing else on them. Well, perhaps management traffic (ssh / https) on one, but the other should carry nothing but corosync traffic. Gig links are fine for this ... use a separate switch and NIC for each. As much as possible, no single hardware, cable, or power failure should be able to knock out both of these networks. Read up on watchdogs, set them up, and test them.
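
For what it's worth, the two corosync networks end up as two rings in /etc/pve/corosync.conf. A trimmed sketch (the node name and addresses are made up, and the real file contains more than this):

    nodelist {
      node {
        name: node1
        nodeid: 1
        quorum_votes: 1
        ring0_addr: 10.0.100.1   # corosync-only network
        ring1_addr: 10.0.101.1   # second, independent network
      }
      # ... one block per node ...
    }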

What to do with all of the Ceph data traffic to the HA cluster? Use two 10G switches and link each server to both switches with separate NICs. Again, set it up so that no one piece of failed hardware can stop communications. We're using Debian's built-in primary / backup NIC bonding to do this, and it works flawlessly. We don't have any issues running our Ceph-to-VM traffic and Ceph sync traffic on the same 10G link, but you can separate them.
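
In /etc/network/interfaces that bond looks roughly like this (the NIC names and address are examples, and you'll need the ifenslave package):

    auto bond0
    iface bond0 inet static
        address 10.10.10.11
        netmask 255.255.255.0
        bond-slaves eno1 eno2      # one NIC to each 10G switch
        bond-mode active-backup    # primary / backup, no switch support needed
        bond-primary eno1
        bond-miimon 100            # check link state every 100 ms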

We also connect the VMs to their clients (on the Internet or locally) via a 10G network that's set up the same way as described above. There are many VLANs on this LAN: one for the Internet entering our pfSense firewall, another for the Internet exiting, another for the 192.168.x.y subnets that client VMs use as their LAN, etc. You can make this as complicated or as simple as your needs demand. I strongly recommend the redundant config that I mentioned above. Without it, why run an HA cluster?
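
One way to carry all of those VLANs is a VLAN-aware bridge on top of the bond, then tag each VM's virtual NIC; a sketch (the bridge and bond names are mine):

    auto vmbr1
    iface vmbr1 inet manual
        bridge-ports bond1
        bridge-stp off
        bridge-fd 0
        bridge-vlan-aware yes
        bridge-vids 2-4094    # set the actual VLAN tag per VM NIC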

When you're done, do some testing. We've pulled the power on our primary switches and watched the backups work flawlessly ... after some adjustments.
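
Two quick ways to watch a failover test while you pull cables or power (assuming a bond named bond0):

    cat /proc/net/bonding/bond0   # shows which slave NIC is currently active
    pvecm status                  # cluster membership / quorum during corosync tests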

Hope that this helps or at least gives you more questions.

James
 
