Direct IO failed with Proxmox 8.2.2 and OCFS2 on MSA2052

devaux

Hi there,
Getting lots of I/O errors on two freshly installed Proxmox 8.2.2 nodes with VMs on OCFS2.
Here are the logs from the VM hosts:
Code:
[ 6567.401743] (kvm,2020,5):ocfs2_dio_end_io:2421 ERROR: Direct IO failed, bytes = -5
[ 6567.401756] (kvm,2020,5):ocfs2_dio_end_io:2421 ERROR: Direct IO failed, bytes = -5
[ 6567.401762] (kvm,2020,5):ocfs2_dio_end_io:2421 ERROR: Direct IO failed, bytes = -5
[ 6567.401769] (kvm,2020,5):ocfs2_dio_end_io:2421 ERROR: Direct IO failed, bytes = -5
[ 6567.401774] (kvm,2020,5):ocfs2_dio_end_io:2421 ERROR: Direct IO failed, bytes = -5
[ 6567.401780] (kvm,2020,5):ocfs2_dio_end_io:2421 ERROR: Direct IO failed, bytes = -5
[ 6567.401786] (kvm,2020,5):ocfs2_dio_end_io:2421 ERROR: Direct IO failed, bytes = -5
[ 6567.401792] (kvm,2020,5):ocfs2_dio_end_io:2421 ERROR: Direct IO failed, bytes = -5
[ 6567.401798] (kvm,2020,5):ocfs2_dio_end_io:2421 ERROR: Direct IO failed, bytes = -5
[ 6567.401805] (kvm,2020,5):ocfs2_dio_end_io:2421 ERROR: Direct IO failed, bytes = -5

- HP MSA 2052 connected via 8 Gbit FC
- If I set aio=threads in the disk settings of the VMs, everything works as expected (see the sketch after this list)
- VM disks are in qcow2 format
- Kernel 6.8.4-3-pve
- Found other reports saying the problem started with PVE 8.1.4
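
For reference, a minimal sketch of how I apply the workaround per disk from the CLI; the VMID (100), bus/slot (scsi0) and storage/volume names are placeholders, not my actual setup:
Code:
# set aio=threads on an existing disk (adjust VMID, slot and volume to yours)
qm set 100 --scsi0 ocfs2-store:100/vm-100-disk-0.qcow2,aio=threads
The same option can also be set in the GUI under the disk's advanced settings.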

Since it's a new install, should I go with these settings, or will I run into problems later? Performance issues?
Or are there better cluster file systems out there?
 
Sounds like you ran into this problem: https://bugzilla.proxmox.com/show_bug.cgi?id=5430
The solution, for now, is to install and pin an older kernel.
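
A minimal sketch of the pinning workflow, assuming proxmox-boot-tool manages your boot entries (the version string is just an example; pick one from the list output):
Code:
# show kernels known to the boot loader
proxmox-boot-tool kernel list
# pin one version so it stays the boot default across upgrades
proxmox-boot-tool kernel pin 6.5.13-5-pve
# (undo later with: proxmox-boot-tool kernel unpin)
reboot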

You don't have much choice when it comes to cluster file systems, so you might as well stick with OCFS2.

P.S. I don't believe the PVE developers explicitly test OCFS2, or any other freely available cluster file system, as part of their QA process. So your best bet is to have a test environment and deploy there before upgrading production.


Blockbridge : Ultra low latency all-NVME shared storage for Proxmox - https://www.blockbridge.com/proxmox
 
Exactly this!
Can you recommend a kernel version?
 
OK, no success with 6.5.13-5-pve; 6.2.16-20-pve looks promising so far (install/pin sketch at the end of this post).
Is there a lifecycle for security updates for each kernel from Proxmox?
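
In case it helps anyone following along, a sketch of the install-and-pin steps for that kernel; the package name is my assumption (older kernels ship as pve-kernel-*, newer ones as proxmox-kernel-*, so verify first):
Code:
apt update
# verify the exact package name with: apt search pve-kernel-6.2
apt install pve-kernel-6.2.16-20-pve
proxmox-boot-tool kernel pin 6.2.16-20-pve
reboot
# after reboot, confirm the running kernel
uname -r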