OCFS2 support

jlauro

Member
Feb 10, 2024
Any plans in the roadmap for official OCFS2 support? It sounds like at least some are using it.

Considering Proxmox to migrate from VMware vSphere with dedicated SANs. LVM on top of iSCSI for shared storage works OK, but does not support snapshots. I assume with OCFS2 you could relatively easily do snapshots with the qcow2 format and have similar performance for highly available storage as LVM on top of iSCSI. Is there some other option that I should consider for HA clusters with shared SAN storage? I do prefer separate storage over HCI, and although Ceph might be an interesting consideration for greenfield, it's not a practical option for most of our existing servers.
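To be concrete about what I mean by snapshots here: with a shared filesystem the guest disks could live as plain qcow2 files, and qcow2 internal snapshots come for free. A minimal sketch (the mount point and image path are hypothetical, and the Python wrapper is only for illustration):

```python
"""Illustration only: qcow2 internal snapshots on a shared filesystem mount.

The path below is a made-up example; the point is that file-based storage
allows qcow2 snapshots, which raw LVs on top of iSCSI cannot offer.
"""
import subprocess

DISK = "/mnt/ocfs2/images/100/vm-100-disk-0.qcow2"  # hypothetical location

def qemu_img(*args):
    """Run a qemu-img command and fail loudly if it errors."""
    cmd = ("qemu-img",) + args
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

# Create a qcow2 image on the shared mount (Proxmox would normally do this).
qemu_img("create", "-f", "qcow2", DISK, "32G")

# Take and list an internal snapshot; reverting is possible while the VM is off.
qemu_img("snapshot", "-c", "before-upgrade", DISK)
qemu_img("snapshot", "-l", DISK)
# qemu_img("snapshot", "-a", "before-upgrade", DISK)  # revert
```

As far as I understand, Proxmox already does exactly this on NFS or plain directory storage; the question is whether OCFS2 on top of a shared LUN would be a supported way to get the same thing.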
 
OCFS2 has been around for decades. I used it over a decade ago and it was cumbersome and tricky to set up with the LVM cluster daemon and IIRC corosync/pacemaker/heartbeat or whatever the technology was called back then. Oracle does not have much interest in it anymore; they have ACFS, which is superior in every way. There are currently no active webpages available from Oracle (the original creator and maintainer) about it. Development is still going on according to the mailing list, yet I have no idea what its state is, and that may be the problem for Proxmox with supporting it.

In the end, you may try both OCFS2 and GFS2, as was also suggested in the mentioned Bugzilla bug report, and see for yourself. The last time I tried (having the same situation as you: FC SAN with thick LVM), both were so bad that they were not usable: kernel panics deep inside the GFS2 and OCFS2 drivers, and Proxmox will not go down that rabbit hole to fix things. It's sad but true that the Linux ecosystem does not provide a good and rock-solid dedicated shared-storage cluster filesystem; there is, however, maybe the best distributed shared-storage cluster filesystem available ... CEPH.
iSCSI/FC SAN was a good fit for VMware, yet it is not for Proxmox VE, never has been and probably never will be. The cluster stack has been and will be optimized for distributed shared storage, because it's the much better, cheaper and newer technology that all the big players already committed to years ago.

There are, however, other solutions from other vendors and open source that want to bridge the gap, e.g.
having another storage controller in between the FC SAN and PVE that supports e.g. ZFS-over-iSCSI, yet there is no rock-solid HA stack around it. I still have HA ZFS on my TODO list, yet never had time to try it out.
 
The main problem is people bought something for VMware usage, and it is now pretty much dead-weight. So they want Proxmox or the open-source community to fix their problem. Yes there are alternatives, Blockbridge and a few others, so you are not only left with CEPH and soon-to-be-EOLed Gluster.
 
The main problem is people bought something for VMware usage, and it is now pretty much dead-weight.
sorry ... little bit of rant ...

<rant> Why? They can still pay VMware / Broadcom. They bought knowingly into the vendor lock-in and thought everything would be fine ... and corporate greed was - once again - more powerful. Man, I hate those companies. And only _now_ customers want to invest in free software ... naturally, they do not want to change and expect others to fix their shortcomings because they bet on the wrong horse. </rant>

Yes there are alternatives, Blockbridge and a few others, so you are not only left with CEPH and soon-to-be-EOLed Gluster.
Yes, of course, but you normally don't want to buy a new SAN. The hardware for Ceph will be cheaper and easier to scale. That is one of the big advantages of this technology. Not to mention the speed increase of local disks: we benchmarked an FC-based NVMe SAN a couple of years ago, and the latency of local NVMe is so much lower than FC-based access to the same disks.
 

VMware was a great bet for well over a couple of decades. In this industry that's a good run, and it's fairly rare for something to degenerate as quickly as Broadcom is causing it to, with 3x to 10x price increases.

Of course local NVMe is going to be the fastest... and for things that can be clustered that way, with distributed copies of the data, it works great. Not everything supports that (and for the things that do, the cost of the multiple copies adds up).

I disagree on cheaper and easier to scale. Not all SANs are overpriced. With CEPH you have to multiply the number of drives and also throw more network ports and switches at it to get near the same performance. It's easier to scale storage separately from compute, especially during hardware refreshes. If you are finding CEPH cheaper, you are probably looking at the wrong SANs.
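To put rough numbers behind that: a back-of-the-envelope sketch, where the drive size, drive counts and the RAID6 layout are all assumptions on my part, and the 3x factor is Ceph's default replication for replicated pools:

```python
# Back-of-the-envelope comparison: usable capacity from the same number of
# drives with Ceph 3x replication vs. a RAID6-based SAN. All sizes assumed.

drive_tb = 7.68          # assumed NVMe drive size in TB
drives = 24              # assumed drive count on each side
raw_tb = drive_tb * drives

ceph_replicas = 3                          # Ceph default for replicated pools
ceph_usable = raw_tb / ceph_replicas

raid6_groups = 2                           # assumed: two 12-drive RAID6 groups
raid6_usable = (drives - 2 * raid6_groups) * drive_tb

print(f"raw capacity:           {raw_tb:6.1f} TB")
print(f"Ceph, 3x replication:   {ceph_usable:6.1f} TB usable")
print(f"SAN, RAID6 (2x 10+2):   {raid6_usable:6.1f} TB usable")
print(f"drive multiplier to match the RAID6 usable space with Ceph: "
      f"{raid6_usable / ceph_usable:.1f}x")
```

Erasure coding could narrow that gap, but as far as I know 3-way replication is still the usual recommendation for VM workloads, and the replication traffic is also where the extra network ports and switches come in.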
 
Sounds like you definitely had issues with OCFS2 and GFS2 (in the not too distant past). Not sure, but are you saying SAN with thick LVM is at least stable (aside from no snapshots outside of backups and other limitations)?

Not sure what you are looking for in terms of active webpages.
https://docs.oracle.com/en/operatin...terFileSystemVersion2inOracleLinux.html#ocfs2
and lots of other pages from Oracle and others.

I'm not familiar with ACFS, I'll have to look into that more.
 
I disagree on cheaper and easier to scale. Not all SANs are overpriced. With CEPH you have to multiply the number of drives and also throw more network ports and switches at it to get near the same performance. It's easier to scale storage separately from compute, especially during hardware refreshes. If you are finding CEPH cheaper, you are probably looking at the wrong SANs.
Could be that I'm biased here. I'm used to FC SANs. Most customers I worked with have those, and the costs for infrastructure and support are immense; it is also VERY, VERY fast, so maybe you get what you pay for. Some systems also run on "stupid block storage boxes" like Fujitsu Eternus, which are very cheap in comparison to the monsters from 3PAR or Pure. For iSCSI, I would imagine that the network costs are similar, as the components are the same.

Not sure, but are you saying SAN with thick LVM is at least stable (aside from no snapshots outside of backups and other limitations)?
Yes, we use it everywhere (also besides virtualization) and it works really, really well. Never had any problem with it in decades. Yes, it lacks snapshots and thin provisioning, yet it is rock solid.

Not sure what you are looking for in terms of active webpages.
https://docs.oracle.com/en/operatin...terFileSystemVersion2inOracleLinux.html#ocfs2
and lots of other pages from Oracle and others.
Thank you. Those links are not referenced on the kernel wiki page about OCFS2, which only has dead and archive.org links. I'm not sure how good the state of OCFS2 in the Ubuntu LTS kernel is; the documentation you referenced is just for UEK (Oracle's Unbreakable Enterprise Kernel).
 
