OCFS2 as shared storage for a 3-node Proxmox VE cluster connected to an HPE MSA with FC HBA

kubaz

Hi all,

I recently set up OCFS2 as shared storage for a 3-node Proxmox VE cluster connected to an HPE MSA storage array. After running into several issues, I decided to ask for help here. I have problems with snapshots, migrations, and Windows VMs.

1. Prerequisites:
  1. Proxmox cluster created
  2. Multipath configured (a minimal multipath.conf sketch follows below)
  3. PVE version 8.4.1
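For reference, a minimal sketch of what that multipath config could look like (just an illustration, not the exact production file; the WWID below is only a placeholder, and multipath -ll should then show the mpatha device used in the later steps):
Code:
# /etc/multipath.conf (sketch only)
defaults {
    user_friendly_names yes
    find_multipaths     yes
}

multipaths {
    multipath {
        # placeholder - replace with the MSA LUN's real WWID
        wwid  36001234567890abcdef000000000001
        alias mpatha
    }
}
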
2. I've installed the OCFS2 packages on all nodes
Code:
apt install -y ocfs2-tools
modprobe ocfs2

3. Then I created the OCFS2 cluster config on all nodes
Code:
cat > /etc/ocfs2/cluster.conf << 'EOF'
cluster:
    name = pveocfs2
    heartbeat_mode = local
    node_count = 3

node:
    number = 0
    cluster = pveocfs2
    ip_port = 7777
    ip_address = 192.168.223.60
    name = pve1

node:
    number = 1
    cluster = pveocfs2
    ip_port = 7777
    ip_address = 192.168.223.30
    name = pve2

node:
    number = 2
    cluster = pveocfs2
    ip_port = 7777
    ip_address = 192.168.223.20
    name = pve3
EOF
I used a different cluster name than in my Proxmox cluster (maybe that's causing some errors).

4. Later on, I configured O2CB on all nodes
Code:
cat > /etc/default/o2cb << 'EOF'
O2CB_ENABLED=true
O2CB_BOOTCLUSTER=pveocfs2
O2CB_HEARTBEAT_THRESHOLD=31
O2CB_IDLE_TIMEOUT_MS=30000
O2CB_KEEPALIVE_DELAY_MS=2000
O2CB_RECONNECT_DELAY_MS=2000
EOF

5. I've configured the firewall for OCFS2 on all nodes
Code:
iptables -I INPUT -p tcp --dport 7777 -s 192.168.223.0/24 -j ACCEPT
iptables -I INPUT -p udp --dport 7777 -s 192.168.223.0/24 -j ACCEPT

6. I've started the OCFS2 services (first on one node, then on the others)
Code:
systemctl enable o2cb ocfs2
systemctl start o2cb
service o2cb status

7. Then I formatted the filesystem (on one node only)
Code:
mkfs.ocfs2 -L "vmstore" -N 3 --fs-feature-level=max-features /dev/mapper/mpatha

8. Mounting on all nodes
Code:
mkdir -p /mnt/vmstore
echo '/dev/mapper/mpatha /mnt/vmstore ocfs2 _netdev,defaults,noatime 0 0' >> /etc/fstab
mount -a
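
To sanity-check that every node actually sees and mounts the volume, something like this should work (mounted.ocfs2 ships with ocfs2-tools):
Code:
# quick detect: label/UUID of the OCFS2 volume on the device
mounted.ocfs2 -d /dev/mapper/mpatha
# full detect: which cluster nodes currently have it mounted
mounted.ocfs2 -f /dev/mapper/mpatha
df -h /mnt/vmstore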

9. And then I've added a directory structure (on one node)
Code:
mkdir -p /mnt/vmstore/{images,template/iso,template/cache,dump,snippets}
chmod -R 755 /mnt/vmstore/

10. Lastly, I've added the storage in Proxmox as a shared directory.
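
Roughly like this (the storage ID "vmstore" and the content types are just my choice here; the optional is_mountpoint flag keeps PVE from writing into the directory while the OCFS2 volume is not mounted):
Code:
pvesm add dir vmstore --path /mnt/vmstore --content images,iso,backup,snippets --shared 1 --is_mountpoint yes

# resulting entry in /etc/pve/storage.cfg
dir: vmstore
        path /mnt/vmstore
        content images,iso,backup,snippets
        shared 1
        is_mountpoint yes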

ISSUES:
- I primarily used "Cluster-01" in the OCFS2 configs and changed that later on to "pveocfs2", but the PVE cluster name remained as it was (apparently OCFS2 doesn't like non-alphanumeric characters).
- Snapshots don't work on started VMs:
Code:
TASK ERROR: VM 102 qmp command 'blockdev-snapshot-internal-sync' failed - Failed to create snapshot 'test' on device 'drive-scsi0': Input/output error
- Migration sometimes doesn't work:
Code:
2025-05-27 00:28:58 migration status error: failed - Error in migration completion: Input/output error
2025-05-27 00:28:58 ERROR: online migrate failure - aborting
2025-05-27 00:28:58 aborting phase 2 - cleanup resources
2025-05-27 00:28:58 migrate_cancel
2025-05-27 00:29:01 ERROR: migration finished with problems (duration 00:00:12)
TASK ERROR: migration problems

I hope you can help me with this in some way ;) or maybe someone has encountered similar errors.
 
The last time I tried this, almost 10 years ago, we ran into the same problems and abandoned OCFS2. It's an unsupported solution, and it sadly feels like it is unsupported for a good reason. We went with FC-based SANs (3 different models over the years) until we eventually used the FC SAN's 12G SAS SSDs to build a Ceph cluster.

Yes, it's not optimal to not have snapshot capability, but nowadays that's not so bad anymore with PBS, which is a great help with its online differential backups via dirty-block tracking. We implemented a ZFS-over-iSCSI host that ran on the FC SAN, online-migrated the VMs to that storage, took snapshots and worked with them as we wanted, and then online-migrated them back. This worked perfectly, but needed additional steps.
 
1. Has anyone managed to get snapshots working on an external disk array, using an OCFS2 or GFS2 filesystem shared by several nodes?

2. Is it possible to get a well-functioning shared file system (OCFS2 or GFS2) on an array for several nodes without snapshots?

3. In the case where snapshots are unavailable, is Proxmox Backup Server able to make a copy of the VM, say, every hour?
 
Has anyone managed to get snapshots working on an external disk array, using an OCFS2 or GFS2 filesystem shared by several nodes?
As I already wrote, technically it sometimes worked, but it wasn't stable and therefore not production-ready.

Is it possible to get a well-functioning shared file system (OCFS2 or GFS2) on an array for several nodes without snapshots?
Distributed shared storage is easy with Ceph; dedicated shared storage is not something I can recommend, because the options all suck one way or the other. There are posts in the forum about working setups, please search for them, but they are outnumbered by the ones with problems. I would not go into production with a setup that is not supported if you don't have staff at hand who can fix any upcoming problems.

It's best not to force your VMware hardware stack onto PVE and expect it to work the same; it doesn't. This applies to any hypervisor change.

In the case where snapshots are unavailable, is Proxmox Backup Server able to make a copy of the VM, say, every hour?
You mean a backup, and yes, that works flawlessly (depending on the number of changes, network and storage speed, etc.).
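
For example, via a backup job with an hourly schedule (Datacenter -> Backup in the GUI), or as a rough sketch driven from cron; the storage name "pbs" and the VMID are placeholders here, and --mode snapshot means a QEMU live backup, not a storage snapshot, so it also works on storage without snapshot support:
Code:
# /etc/cron.d/backup-vm102 (example file name)
# hourly live backup of VM 102 to a PBS-backed storage named "pbs"
0 * * * * root vzdump 102 --storage pbs --mode snapshot --quiet 1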
 
Hello Kubaz,

The solution is to use the "aio=threads,iothread=1" option for hard drives.
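
For example (VM 102 and scsi0 are taken from the error above; the volume name is only a placeholder, check the real one with qm config 102; iothread also requires the VirtIO SCSI single controller, and the VM needs a stop/start for the change to take effect):
Code:
qm set 102 --scsihw virtio-scsi-single
qm set 102 --scsi0 vmstore:102/vm-102-disk-0.raw,aio=threads,iothread=1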

We only use the raw disk format with OCFS2; we no longer use QCOW2 on top of OCFS2. The reason is that snapshots often produce errors, which may cause the disk to become corrupted. In the event of a network error/failure on the OCFS2 path, some VMs often end up corrupted.

The only option is to use backups while the system is powered off.

Regards, Kenneth Miller
 
The solution is to use the "aio=threads,iothread=1" option for hard drives.

We only use the raw disk format with OCFS2; we no longer use QCOW2 on top of OCFS2. The reason is that snapshots often produce errors, which may cause the disk to become corrupted. In the event of a network error/failure on the OCFS2 path, some VMs often end up corrupted.

The only option is to use backups while the system is powered off.
Just so that I understand it: you ended up with a feature set similar to LVM, but with more overhead and more problems?
 
Regarding OCFS2:
Unfortunately, there are customers who use it, and if their setup is based on such a SAN and it doesn't support ZFS, there's no other option.

Personally, I prefer ZFS, which then has all the features. But that wasn't user kubaz's question. He wanted a solution that would allow his VMs to boot from OCFS2.