Glusterfs is still maintained. Please don't drop support!

Who cares if it's old, if it works (vs. it not working)?
I do. Because software which doesn't get security updates belongs in museums, not in production, even in homelabs.
2) The goal here is to demonstrate with data, from within a VM, that gluster is still viable from a technology standpoint.
This won't change anything for ProxmoxVE though. GlusterFS support is deprecated and will be removed from qemu, and thus also from ProxmoxVE, due to the stalled development by Red Hat. Benchmarks (no matter on which version) won't change this; convincing the qemu developers that GlusterFS support should be kept would. Ideally also volunteering to maintain GlusterFS and its support in qemu.
 
Seems to have been a time-specific bug, because I currently have at least one server with a MegaRAID SAS controller, whose driver is conveniently still in the Linux kernel: https://github.com/torvalds/linux/blob/master/drivers/scsi/megaraid/megaraid_sas.h - it works on modern Ubuntu kernels. You seem to be pointing to a very time-specific kernel bug around 6.8 regarding JBOD mode, which seems to have been resolved? I actually have multiple servers running Proxmox that have some form of MegaRAID controller in them (not my choice, conversion from VMware garbage).
:shrug: dunno if it is time specific.

(I mean, one would only be able to say this in retrospect. At the time it happened, there was little means to know whether it was going to be a permanent issue or one that would be resolved over time. Either way, the point still remains: at the time, I couldn't update on account of it.)

And now, whilst I could upgrade to PVE8, you'll invariably get others who'll ask the natural question, "why not just upgrade to PVE9 anyway?" (And the answer to that question is that PVE9 brings with it other issues, cf. the e1000 NIC issue.)

Thus, if said e1000 and 9361-8i work in PVE7, why break it?
NVIDIA Ethernet is cheaper than NVIDIA InfiniBand fabric, and NVIDIA is pretty much the most expensive solution out there today. Arista is cheaper, and they're still not a 'cheap' option, whilst Arista has even lower latency options. Talking datacenter networks here. We just purchased ~300 usable ports worth of NVIDIA 400G IB switches with optics. That's a $600k investment, and we don't even have the annual management software license or the NIC side (ConnectX 8) and NIC-side optics or cabling. All in all, I'm estimating $1.2M over a 5 year period. There is no 400G Ethernet fabric that costs $4k/link; it's about half to a quarter of that cost, depending on your switch gear. I think it's a waste of money, but the religion of IB is strong amongst some people.
Depends on the ethernet adapter.

Right now, you can buy the MCX515A-CCAT off of eBay for $125.92. Conversely, you can buy the IB version (MCX555-ECAT) off of eBay for $119.95.

Nvidia (read: Mellanox - and yes, I still call it Mellanox) is expensive because Mellanox has pretty much always been expensive. And it's only recently that other vendors are starting to come out with their own line of products, but in many cases, Mellanox still takes the crown. Myrinet tried. OmniPath tried. Mellanox/IB won.

In terms of what your company purchased - again, it depends.

You can pick up an Edgecore DCS510 AS9716-32D 32-Port 400GbE Bare Metal Switch with ONIE - Part ID: 9716-32D-O-AC-F-US from Colfax Direct, for example, for $16560, which works out to $517.50 per port. Conversely, you could pick up a Mellanox Quantum-2 MQM9790 64-port Non-blocking Unmanaged NDR 400Gb/s InfiniBand Switch - Part ID: MQM9790-NS2F, also from Colfax Direct, for $31125, which works out to $486.33 per port.

Cabling varies depending on how far your runs are going to be and whether the ends are QSFP-DD and/or QSFP112.

In either case, as the data that I have presented shows, you can get IB stuff cheaper than ethernet. And that was still very much the case back in like 2019 when I bought my switch, because I think it was an 18-port 100 GbE switch that cost almost as much as my 36-port Mellanox 100 Gbps IB switch. I looked at it because the ConnectX-4 cards were VPI cards, which means I could set the port LINK_TYPE to either ETH or IB using mstconfig. So I could've gone either way, and IB was cheaper than ETH. (At least now, the ETH premium over IB isn't as outrageous as it used to be. It's only a 6.4% premium now. It used to be anywhere from 15-40% more for 100 GbE vs. 100 Gbps IB.)
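If anyone wants to check the per-port math, here's a quick sketch in Python. The prices and port counts are the Colfax Direct figures quoted in this post; the function names are just made up for illustration, and it only covers the switches themselves (no optics, cabling, or NICs).

```python
def cost_per_port(price_usd: float, ports: int) -> float:
    """Switch list price divided by its port count."""
    return price_usd / ports


def premium_pct(a: float, b: float) -> float:
    """How much more `a` costs than `b`, as a percentage of `b`."""
    return (a - b) / b * 100.0


# Figures quoted in the post (switch only).
eth_per_port = cost_per_port(16560, 32)  # Edgecore AS9716-32D, 400GbE
ib_per_port = cost_per_port(31125, 64)   # Mellanox MQM9790, NDR 400Gb/s IB

print(f"400GbE: ${eth_per_port:.2f}/port")
print(f"NDR IB: ${ib_per_port:.2f}/port")
print(f"ETH premium over IB: {premium_pct(eth_per_port, ib_per_port):.1f}%")
```

Swap in whatever quotes you have for your own gear; the comparison only means something if both prices cover the same scope.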

IB is great, if you know how to take advantage of it.

(I didn't buy 100 Gbps IB for HDD-based storage. I bought it because the HPC apps that I was running at the time were able to regularly hit 80-90 Gbps out of the 100 Gbps possible for RDMA/MPI.)

The ability to offload storage traffic onto said 100 Gbps IB was really just a bonus at that point.

My point was you can't push 100Gbps from a single spinning disk that gives at best 1-10Mbps of throughput (if not reading from cache).
Two things:
1) I agree that you can't push 100 Gbps from a single spinning disk. You might be able to hit 1 Gbps for sequential writes, where the HDD cache might have limited use/benefit. But that, again, wasn't the point of having 100 Gbps IB either. It was a fringe/leftover benefit from running HPC applications that use IB/RDMA/MPI.

2) This has very little to do with the performance difference between ceph (~5% of a drive's capability) vs. gluster (~22% of a drive's capability).
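To put numbers on both points, a minimal sketch in Python that uses only the figures from this post (~1 Gbps from one HDD on a 100 Gbps link, ~5% of a drive's capability for ceph, ~22% for gluster). The variable names are hypothetical, and the code doesn't measure anything; it just does the division.

```python
# Point 1: share of the 100 Gbps link a single HDD could fill at ~1 Gbps.
hdd_gbps = 1.0
link_gbps = 100.0
link_share_pct = hdd_gbps / link_gbps * 100.0

# Point 2: share of a drive's capability each filesystem reached (figures above).
ceph_pct = 5.0
gluster_pct = 22.0
ratio = gluster_pct / ceph_pct

print(f"one HDD fills about {link_share_pct:.0f}% of the link")
print(f"gluster reaches {ratio:.1f}x the drive utilisation of ceph")
```

The two results are independent of each other, which is the point: the link speed doesn't enter into the ceph vs. gluster ratio at all.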
 
I do. Because software which doesn't get security updates belongs in museums, not in production, even in homelabs.
Yes, and as CVE-2026-43134, CVE-2026-43284, and CVE-2026-43500 show, updates are great. [/s]

I.e., if you hadn't updated your kernels, you wouldn't have given yourself these LPE exploits that you previously didn't have.

Same thing with CVE-2024-3094, where, again, if you hadn't updated, you wouldn't have given yourself this backdoor.

Your statement argues that updated software is more secure, and yet these are just four of the more recent CVEs, where the CVSS is 7.8, 7.8, 7.8, and 10.0 respectively.

If you hadn't updated, then you might not have "invited" these issues into your production systems, homelab or otherwise.

Who knows how many others there are where it was an update that gave or exposed the system to issues, where, if you hadn't updated, your system would've been fine.


Ideally also volunteering to maintain GlusterFS and its support in qemu
I don't program; therefore, any programming that I would do to try and help/contribute would be done entirely via vibe coding. And we've already seen what that's done to Nvidia's own GPU drivers.

If anything, my vibe-coding to "help" maintain gluster and/or qemu is a sure-fire way to kill off any remnants of gluster and/or qemu, in the same way that Nvidia's vibe-coding of their own GPU drivers is a sure-fire way to kill off their own drivers.

Perhaps this is the real reason why you suggested it: so that I would be sure to kill it off for good by vibe-coding it, quite literally, to its own death.

(So far, no one has been able to answer my question "if a program is stable, then why does it matter that there aren't as many commits happening?".)

If that's the metric that qemu is using as its rationale for dropping support for gluster, then by that logic, bad code that constantly needs to get fixed would win this battle/race, because the number of commits per month for crappy code would be astronomical: you're always trying to fix something that's fundamentally and critically broken.

But in terms of commits per month, it'd be a winner, and if that's what the qemu devs use to determine what will be supported and what won't, then bad code would get more adoption than good code that doesn't need perpetual fixes just to get it to work properly in the first place.

And if that is the logic/metric that they're using, maybe I should vibe code glusterfs back to being supported by qemu, because I can commit every piece of garbage output that AI generates and thus inflate the number of commits per month with AI/vibe coded slop, just to send the number through the roof.

Something tells me that I can probably automate that with n8n.

(Sidebar: plenty of responses, but still no technical discussion about the fact that gluster is 4.4x faster than ceph (in terms of % of drive capabilities used). Interesting.)