Repurposing VxRail hardware with Ceph

Keeper of the Keys

Jul 7, 2021
At the place I'm currently working I've been given access to a set of old VxRail machines to implement a Proxmox PoC (and I hope I'll be able to help them migrate from VMware to Proxmox in the future :)).

The machines are 3 Dell S570s (probably from 2018/2019), each with 4x 4 TB HDDs and 2x 400 GB SSDs (all at roughly ~25% wear).

As a first test I'm hoping to set them up with Ceph, and I was wondering: should I use one SSD exclusively for the WAL and the other for the DB across all OSDs, or pair them in a ZFS mirror (RAID1), or use some other layout I haven't even thought of?

Thanks!
 
Is there another boot drive?

IIRC the WAL uses 10% of the OSD space by default, though you can customize it, and the DB defaults to living on the WAL disk, though that's movable too. Not sure why one would separate them rather than put both on the same SSD. Do note that your layout gives two points of failure for all OSDs.

However Ceph says 4% or less:
Post in thread 'CEPH shared SSD for DB/WAL?'
https://forum.proxmox.com/threads/ceph-shared-ssd-for-db-wal.80165/post-397544

As a PoC, maybe just use one of the SSDs to boot? Either way, OSDs on HDDs aren't going to be that fast, but they will function.
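As a back-of-the-envelope check, the ~4% block.db guideline from the linked post works out fine for this hardware (a quick sketch; the numbers are the OP's drive sizes, not Ceph defaults):

```python
# Rough check of the ~4% block.db sizing guideline against this hardware:
# 4 TB HDD OSDs, 400 GB SSDs shared as DB/WAL devices. All sizes in GB.

HDD_SIZE_GB = 4000        # one OSD per 4 TB HDD
SSD_SIZE_GB = 400         # one shared DB/WAL SSD
DB_FRACTION = 0.04        # ~4% guideline for block.db on BlueStore

db_per_osd = HDD_SIZE_GB * DB_FRACTION   # 160 GB of DB space per OSD

# With two HDDs sharing one SSD, does the SSD have room for both DBs?
osds_per_ssd = 2
fits = db_per_osd * osds_per_ssd <= SSD_SIZE_GB

print(db_per_osd, fits)   # 160.0 True
```

So two 160 GB DB partitions per 400 GB SSD fit with room to spare, which matches the "one SSD per two HDDs" suggestion below in the thread.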
 
Boot sits on ZFS RAID1 SATA SSDs
Then, for simplicity, if you want to use them for DB/WAL I would use one SSD per two HDDs and not try to split the DB and WAL. You will need to reduce the size allocation, maybe 175 or 190 GB each, whatever fits.
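If you go that route, the DB device and its size can be given when each OSD is created; a sketch using pveceph, where /dev/sdb-/dev/sde stand in for the HDDs and /dev/sdf-/dev/sdg for the SSDs (all device names and the 190 GB figure are placeholders for your actual layout):

```shell
# One SSD (sdf) shared as block.db by two HDD OSDs, ~190 GB each.
pveceph osd create /dev/sdb --db_dev /dev/sdf --db_dev_size 190
pveceph osd create /dev/sdc --db_dev /dev/sdf --db_dev_size 190

# Second SSD (sdg) serves the other two HDDs.
pveceph osd create /dev/sdd --db_dev /dev/sdg --db_dev_size 190
pveceph osd create /dev/sde --db_dev /dev/sdg --db_dev_size 190
```

With no separate --wal_dev given, the WAL lives on the DB device, which is the usual arrangement when you're not splitting them.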

Also, what speed are the NICs in these servers? 1 Gbps will also be slow, but again, it will function.
 
They have 4x 10GbE NICs (Intel X550). I believe all are connected to a switch at 10GbE, but I don't have access to the switch at the moment, and enabling more than one port led to spanning-tree issues, so at least until the person in charge of all that is a bit more available, all 3 nodes are running on a single 10G connection.

I think eventually maybe 2 NICs for public and 2 for internal storage sync/heartbeat.

@gasherbrum assuming this is all the hardware I can currently get, what would you do? ZFS replication?
I could probably even get away with just using the SSDs, since I doubt we'll be installing that many VMs/containers, but that would miss the point of repurposing the old hardware (this VxRail cluster was out of compliance, so it had to be disconnected from the network; now it's a lab environment that I hope may lead to more Proxmox adoption going forward).
 
I'm pretty sure VxRails always had at least "cache" SSDs if there were HDDs in the mix, so you might try to find out who walked away with those. You can always buy used SSDs if you want to PoC Ceph.

If you only have HDDs to work with, I wouldn't use ZFS or Ceph, just regular ext4/LVM.
 
Is the PoC to show that it works and/or to practice, or to show that it's fast? :)

For our first PoC I set up PVE and Ceph on two retired desktop PCs. It will function, just not fast, like I said. 10 Gbit is faster than the HDDs, so don't worry much about networking.

We also repurposed some old hardware. Getting a bunch of used enterprise SSDs worked out well. It's a bit random of course but the ones we bought ended up being all 0-7% wear, as I recall. Our old caching SSDs are small but they can still function as an OSD, once their HDDs are replaced.

There are some caveats to only 3 nodes with Ceph, but it's a PoC.

find who walked away with those
I'm assuming that's the 400 GB drives?
 
a set of old vxrails machines to implement a Proxmox PoC
There is nothing "special" about the VxRail hardware; if the purpose of the exercise is to prove it "works", I can save you the trouble: it works.

The better question is: do you have a better description of the "concept" here? As others noted, the solution would be very slow, but that's only part of the downside. It will be slow and "weak" while using 10-year-old technology, which means its performance per watt will be quite poor, and it will generate quite a bit of heat and be noisy. If you don't pay for power and cooling, that may not be an issue for you.
 
I'm assuming that's the 400 GB drives?
Yes

This is the first foot in the door for Proxmox. The main goal at the moment is to get a lab environment with fewer of the encumbrances of the way they currently work; as a side benefit it may also show some of the good sides of Proxmox.

The outfit is actually mostly a Debian shop (everything NFS-boots from a single image that gets adjusted for different needs, which is also what makes running some experiments more complicated), so in theory Proxmox is a much better fit than VMware, but there is a lot of inertia.
 
The outfit is actually mostly a Debian shop (everything NFS-boots from a single image that gets adjusted for different needs, which is also what makes running some experiments more complicated), so in theory Proxmox is a much better fit than VMware, but there is a lot of inertia.
Ah, makes sense. Will the NFS boot apply to the workloads deployed on this hypervisor? If so, don't bother with Ceph at all at this stage, since you already have storage. Your hardware is perfectly adequate for workload performance, but Ceph on hard drives will give everyone a bad taste, which is really not a fair assessment of "Proxmox" in this context, just proof that the hardware isn't adequate.

And if you're a Debian shop, you're already well suited to operate PVE in production, since your staff already understands most of what it takes to operate a Linux server in production. PVE is just Debian, after all.
 
NFS boot will probably apply to some guests, but mostly it will be a self-contained playground where we can break things and try things faster than elsewhere.
 
OK, in that case you need to pay special attention to your network design.

You have, at MINIMUM, the following disparate network functions:
1. corosync
2. ceph public
3. ceph private
4. NFS payload
5. Internet/service network
6. BMC

Commingling any combination of physical interfaces for 1-4 can have disastrous consequences. Mixing Ceph public and private is pretty normal, since all that does is slow down your disk performance, but clobbering corosync, Ceph, or NFS in your case would result in loss of service and/or crash the whole cluster. I don't know how many (and how fast) NICs are present in your nodes, or how much uplink bandwidth is available from your vrtx switch to your NFS server(s), but I presume it's not going to suffice for proper network isolation; you will want to put some QoS rules on the disparate VLANs.
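One common way to keep those functions apart on limited NICs is one VLAN per function; a hypothetical /etc/network/interfaces fragment (interface names, VLAN IDs, and subnets are all made up for illustration):

```text
# service/Internet network on the bridge
auto vmbr0
iface vmbr0 inet static
    address 192.0.2.11/24
    gateway 192.0.2.1
    bridge-ports eno1
    bridge-stp off
    bridge-fd 0

# corosync ring0 - keep this latency-sensitive link uncontended
auto eno2.10
iface eno2.10 inet static
    address 10.10.10.11/24

# Ceph public
auto eno2.20
iface eno2.20 inet static
    address 10.10.20.11/24

# Ceph cluster (private)
auto eno2.30
iface eno2.30 inet static
    address 10.10.30.11/24
```

VLANs on a shared physical port still share its bandwidth, which is why QoS rules (or dedicated ports for corosync) matter here.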
 
Been migrating 13th-gen Dells (R630s & R730s) from VMware to Proxmox Ceph. Clusters range from 3 to 9 nodes, all odd numbers for quorum.

All hardware is the same (RAM, CPU, NIC, storage, storage controller, firmware). Swapped out PERCs for Dell HBA330s since I don't want to deal with PERC HBA-mode drama.

Networking is 10GbE using Intel X550s on isolated switches for Corosync and Ceph traffic.

All workloads (from DB to DHCP servers) are backed up to a bare-metal Dell R530 PBS using ZFS, also with a Dell HBA330.

The servers never had SSDs, just SAS drives. Not hurting for IOPS. Proxmox is mirrored on two small SAS drives using ZFS RAID-1; the rest of the drives are Ceph OSDs.

I use the following optimizations learned through trial-and-error. YMMV.

Code:
    Set SAS HDD Write Cache Enable (WCE) (sdparm -s WCE=1 -S /dev/sd[x])
    Set VM Disk Cache to None if clustered, Writeback if standalone
    Set VM Disk controller to VirtIO-Single SCSI controller and enable IO Thread & Discard option
    Set VM CPU Type for Linux to 'Host'
    Set VM CPU Type for Windows to 'x86-64-v2-AES' on older CPUs/'x86-64-v3' on newer CPUs/'nested-virt' on Proxmox 9.1
    Set VM CPU NUMA
    Set VM Networking VirtIO Multiqueue to 1
    Set VM Qemu-Guest-Agent software installed and VirtIO drivers on Windows
    Set VM IO Scheduler to none/noop on Linux
    Set Ceph RBD pool to use 'krbd' option
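Several of the VM-level settings in that list map to one-liners on the PVE CLI; a sketch assuming VM ID 100 and an RBD storage named cephpool (both made-up names):

```shell
# VirtIO-SCSI-single controller; disk with IO thread + discard, cache=none (clustered)
qm set 100 --scsihw virtio-scsi-single
qm set 100 --scsi0 cephpool:vm-100-disk-0,iothread=1,discard=on,cache=none

# CPU type (Linux guest) and NUMA
qm set 100 --cpu host --numa 1

# VirtIO NIC with multiqueue set to 1, as in the list above
qm set 100 --net0 virtio,bridge=vmbr0,queues=1

# enable the QEMU guest agent (agent software must be installed in the guest)
qm set 100 --agent enabled=1

# use the kernel RBD client for the Ceph storage
pvesm set cephpool --krbd 1
```

The same options are available in the GUI under the VM's Hardware and Options tabs; the CLI form is just handier for applying them to many VMs.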
 
Thanks for all the advice, will be playing with stuff and trying things :)

Out of curiosity: while I know HDDs are not great, these are SAS disks, which should provide better performance, and all in all this system was supposedly performant enough when it was running VxRail. Looking at the hardware, I get the impression that VxRail probably uses a lot of ideas similar to Ceph, which made me think I should also be able to get acceptable performance with the HDDs + SSD cache.
 
Thanks for all the advice, will be playing with stuff and trying things :)

Out of curiosity: while I know HDDs are not great, these are SAS disks, which should provide better performance, and all in all this system was supposedly performant enough when it was running VxRail. Looking at the hardware, I get the impression that VxRail probably uses a lot of ideas similar to Ceph, which made me think I should also be able to get acceptable performance with the HDDs + SSD cache.
VxRails were always sold in either all-flash or hybrid (SSD and HDD) storage configs, so yes, it would have been performant enough in that config; however, you don't have those SSDs, so it will not be performant.