ocfs2 kernel bug

Rico29

Jan 10, 2018
Hello,
I'm running the latest Proxmox version (5.1-41) with an up-to-date kernel (Linux virtm7 4.13.13-2-pve #1 SMP PVE 4.13.13-32 (Thu, 21 Dec 2017 09:02:14 +0100) x86_64 GNU/Linux).

I'm using an ocfs2 partition as shared storage between two nodes.
The ocfs2 version is 1.8.4-4.
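For reference, a shared ocfs2 volume like this is typically mounted on each node with an fstab entry along these lines (the device path and mount point here are just placeholders; _netdev makes sure the mount waits for the network and the o2cb stack):
Code:
# /etc/fstab on both nodes (device and mount point are examples)
/dev/mapper/sharedvol  /mnt/shared  ocfs2  _netdev,defaults  0 0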

Today one of my two nodes crashed completely because of a kernel bug which seems related to ocfs2:
Jan 10 09:53:19 virtm7 kernel: [65954.745299] kernel BUG at fs/ocfs2/suballoc.c:2017!
Jan 10 09:53:19 virtm7 kernel: [65954.745338] invalid opcode: 0000 [#1] SMP
Jan 10 09:53:19 virtm7 kernel: [65954.745352] Modules linked in: tcp_diag inet_diag ip_set ip6table_filter ip6_tables ocfs2 quota_tree dm_round_robin ocfs2_dlmfs ocfs2_stack_o2cb ocfs2_dlm binfmt_misc iptable_filter ocfs2_nodemanager ocfs2_stackglue softdog nfnetlink_log nfnetlink ipmi_ssif intel_rapl sb_edac joydev input_leds x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc mgag200 aesni_intel aes_x86_64 hid_generic ttm usbkbd usbmouse drm_kms_helper crypto_simd glue_helper drm usbhid cryptd i2c_algo_bit snd_pcm intel_cstate 8021q hid fb_sys_fops syscopyarea sysfillrect garp mrp sysimgblt snd_timer snd video intel_rapl_perf soundcore dcdbas mei_me ipmi_si mei ipmi_devintf pcspkr ipmi_msghandler lpc_ich shpchp acpi_pad acpi_power_meter wmi mac_hid dm_multipath
Jan 10 09:53:19 virtm7 kernel: [65954.745577] scsi_dh_rdac scsi_dh_emc scsi_dh_alua vhost_net vhost tap ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi sunrpc bonding ip_tables x_tables autofs4 ahci libahci bnx2x tg3 ptp mdio megaraid_sas pps_core libcrc32c
Jan 10 09:53:19 virtm7 kernel: [65954.745647] CPU: 24 PID: 40753 Comm: kworker/24:6 Not tainted 4.13.13-2-pve #1
Jan 10 09:53:19 virtm7 kernel: [65954.745667] Hardware name: Dell Inc. PowerEdge M620/0NJVT7, BIOS 2.4.3 07/02/2014
Jan 10 09:53:19 virtm7 kernel: [65954.745705] Workqueue: dio/dm-1 dio_aio_complete_work
Jan 10 09:53:19 virtm7 kernel: [65954.745720] task: ffff8e3b0b06c5c0 task.stack: ffffa1a64d044000
Jan 10 09:53:19 virtm7 kernel: [65954.745778] RIP: 0010:ocfs2_claim_metadata+0x15d/0x170 [ocfs2]
Jan 10 09:53:19 virtm7 kernel: [65954.745793] RSP: 0018:ffffa1a64d047898 EFLAGS: 00010297
Jan 10 09:53:19 virtm7 kernel: [65954.745807] RAX: 0000000000000004 RBX: 0000000000000001 RCX: ffffa1a64d047940
Jan 10 09:53:19 virtm7 kernel: [65954.745825] RDX: 0000000000000001 RSI: ffff8e3b05b41480 RDI: ffff8e3b05fa29c0
Jan 10 09:53:19 virtm7 kernel: [65954.745842] RBP: ffffa1a64d0478f8 R08: ffffa1a64d04793a R09: ffffa1a64d04793c
Jan 10 09:53:19 virtm7 kernel: [65954.745860] R10: ffff8e3b07b63bb8 R11: 00000000001bbe05 R12: ffff8e42f9436000
Jan 10 09:53:19 virtm7 kernel: [65954.745878] R13: 0000000000000000 R14: ffffffffc0a9f8a5 R15: ffffa1a64d047948
Jan 10 09:53:19 virtm7 kernel: [65954.745906] FS: 0000000000000000(0000) GS:ffff8e3b0fb00000(0000) knlGS:0000000000000000
Jan 10 09:53:19 virtm7 kernel: [65954.745928] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 10 09:53:19 virtm7 kernel: [65954.745942] CR2: 00007f5feead31c0 CR3: 000000080813c000 CR4: 00000000001426e0
Jan 10 09:53:19 virtm7 kernel: [65954.745960] Call Trace:
Jan 10 09:53:19 virtm7 kernel: [65954.745983] ocfs2_create_new_meta_bhs.isra.50+0x7e/0x420 [ocfs2]
Jan 10 09:53:19 virtm7 kernel: [65954.746002] ? __kmalloc+0x15c/0x1e0
Jan 10 09:53:19 virtm7 kernel: [65954.746023] ocfs2_add_branch+0x1e5/0x820 [ocfs2]
Jan 10 09:53:19 virtm7 kernel: [65954.746049] ? ocfs2_inode_cache_lock+0x12/0x20 [ocfs2]
Jan 10 09:53:19 virtm7 kernel: [65954.746073] ocfs2_grow_tree+0x300/0x710 [ocfs2]
Jan 10 09:53:19 virtm7 kernel: [65954.746099] ? ocfs2_metadata_cache_unlock+0x19/0x20 [ocfs2]
Jan 10 09:53:19 virtm7 kernel: [65954.746126] ? ocfs2_buffer_cached.isra.6+0x8b/0x1a0 [ocfs2]
Jan 10 09:53:19 virtm7 kernel: [65954.746151] ocfs2_split_and_insert+0x2b8/0x410 [ocfs2]
Jan 10 09:53:19 virtm7 kernel: [65954.746176] ocfs2_split_extent+0x397/0x510 [ocfs2]
Jan 10 09:53:19 virtm7 kernel: [65954.746199] ocfs2_change_extent_flag+0x208/0x3c0 [ocfs2]
Jan 10 09:53:19 virtm7 kernel: [65954.746919] ocfs2_mark_extent_written+0x14b/0x1d0 [ocfs2]
Jan 10 09:53:19 virtm7 kernel: [65954.747700] ocfs2_dio_end_io_write+0x4b5/0x690 [ocfs2]
Jan 10 09:53:19 virtm7 kernel: [65954.748335] ? ocfs2_allocate_extend_trans+0x190/0x190 [ocfs2]
Jan 10 09:53:19 virtm7 kernel: [65954.749042] ocfs2_dio_end_io+0x3e/0x70 [ocfs2]
Jan 10 09:53:19 virtm7 kernel: [65954.749770] dio_complete+0x7e/0x1a0
Jan 10 09:53:19 virtm7 kernel: [65954.750431] dio_aio_complete_work+0x19/0x20
Jan 10 09:53:19 virtm7 kernel: [65954.751087] process_one_work+0x1e9/0x410
Jan 10 09:53:19 virtm7 kernel: [65954.751859] worker_thread+0x4b/0x420
Jan 10 09:53:19 virtm7 kernel: [65954.752550] kthread+0x109/0x140
Jan 10 09:53:19 virtm7 kernel: [65954.753218] ? process_one_work+0x410/0x410
Jan 10 09:53:19 virtm7 kernel: [65954.753885] ? kthread_create_on_node+0x70/0x70
Jan 10 09:53:19 virtm7 kernel: [65954.754482] ret_from_fork+0x25/0x30
Jan 10 09:53:19 virtm7 kernel: [65954.755098] Code: 7d a8 49 89 d8 48 c7 c1 5e 03 aa c0 ba f7 07 00 00 48 c7 c6 00 ca a9 c0 4c 89 65 a8 e8 9d fd d6 ff 8b 45 a4 e9 6a ff ff ff 0f 0b <0f> 0b 0f 0b e8 8a cb c1 c8 66 2e 0f 1f 84 00 00 00 00 00 0f 1f
Jan 10 09:53:19 virtm7 kernel: [65954.756423] RIP: ocfs2_claim_metadata+0x15d/0x170 [ocfs2] RSP: ffffa1a64d047898
Jan 10 09:53:19 virtm7 kernel: [65954.757094] ---[ end trace 03448bbc097254b4 ]---

Can anyone help me with that?
Regards
Cédric
 
Sorry, my kernel was not completely up-to-date. I've updated it to:
Linux virtm7 4.13.13-4-pve #1 SMP PVE 4.13.13-35 (Mon, 8 Jan 2018 10:26:58 +0100) x86_64 GNU/Linux
 
If you can reproduce this with either Ubuntu or mainline kernels, I would report it upstream. We neither use, modify, nor test OCFS2 in any way.
 
The developers are still very active.

Unfortunately not the ones from Oracle; they've been working on ACFS for years. But you are right - at least the devel mailing list is active.
Maybe they can help with your problem. OCFS2 is not a frequent topic here and it is officially not supported.
 
ocfs2 is not dead (a lot of commits on the dev mailing list), but it's full of bugs. (I'm using it in production in some VMs, and since kernel > 3.16 I've had a lot of random crashes.)

For VM hosting, it's better to use a shared LVM.
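If it helps, shared LVM on top of a SAN LUN is set up in Proxmox roughly like this (the device, VG and storage names below are made up, and the exact options should be checked against the storage documentation):
Code:
# on one node: create the PV/VG on the shared (multipath) device
pvcreate /dev/mapper/sanlun
vgcreate vg_san /dev/mapper/sanlun

# register the VG as a shared LVM storage for all nodes
pvesm add lvm san-lvm --vgname vg_san --content images --shared 1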
 
I stopped using OCFS2 in 2012 due to a lot of bugs, yet I still loved the idea of it. We also switched to LVM back then. We do use ACFS heavily though (not for VMs, but for Oracle RAC), which is the new OCFS2 if you use Oracle products. A lot of the old ideas and a ton of new ones moved with the developers to ACFS.

Have you heard back from the devs about your problem? Could they help?
 
Hello (again),
As I said in the other post "forum.proxmox.com/threads/patch-ocfs2-kernel-bug-at-fs-ocfs2-alloc-c-1515.39679/", I have found a possible patch in a Google Groups post (groups.google.com/forum/#!searchin/linux.debian.bugs.dist/841144$20kernel$20bug|sort:date/linux.debian.bugs.dist/HtBS1AxLucM/3mQNA5UpAQAJ).
Btw, this morning I updated my PVE kernel from 4.13.13-2 to 4.13.13-5 and retested the storage with 2 VMs (on 2 different hosts) using Iometer.
After the test I didn't see the bug in /var/log/syslog, but one VM is dead...
 
Hello

We have similar problems with our 8-node OCFS2 cluster used in our Proxmox VE lab. The latest kernel, 4.13.13-5, made things a little more stable, but our nodes still crash randomly under high IO load. We can run up to 50 of our stress-test VMs, and after a while (30 minutes to 2 hours) a node or the whole cluster is dead. Usually there is a quick kernel oops message saying something about ocfs2_alloc or ocfs2_dlm. The problems started at the beginning of December 2017 after an update. Before that we could easily run up to 150 instances of these VMs.

Now, for testing, I reformatted our ocfs2 volume using: mkfs.ocfs2 -T datafiles [path to the volume]. Previously I had used 1M as the filesystem cluster size and 4k as the block size, because we have lots of large files in the system.
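For comparison, the two formatting approaches described above look roughly like this (the device path and label are just placeholders; -T datafiles lets mkfs.ocfs2 pick sizes suited to large files):
Code:
# previous layout: explicit 4K block size and 1M cluster size
mkfs.ocfs2 -b 4K -C 1M -L shared01 /dev/mapper/sharedvol

# new layout: let the "datafiles" filesystem type choose the sizes
mkfs.ocfs2 -T datafiles -L shared01 /dev/mapper/sharedvol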

Now I am doing the stability test again with the same stress-test VMs, and I can again easily have 150-200 instances of these VMs up, without a node or cluster crash. The stability test has now been running for over 3 hours.

Could there be some kind of memory or space allocation issue in ocfs2 when using a large block and/or cluster size which causes the kernel panic or oops?
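For anyone who wants to compare their own volume, the block and cluster size can be read from the ocfs2 superblock; something like the following should show them (device path is a placeholder):
Code:
# dump the superblock stats, including "Block Size Bits" and "Cluster Size Bits"
debugfs.ocfs2 -R "stats" /dev/mapper/sharedvol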

If this thing is now stable, I'm fine with that. Otherwise I have to replace ocfs2 with something else.
 
Hello again,

After five hours of testing, one node rebooted for an unknown reason, probably because it ran out of memory. The cluster continued working fine and the faulty node joined back into the cluster without problems. No ocfs2-related errors or warnings were found in /var/log/messages or syslog.

I will still continue testing for a while.
 
Hello jaakko,
I experienced the same behaviour, the ocfs2 crash after heavy VM IO load.
As I wrote in my previous message, someone has found a solution (as far as I understood, due to an incorrect variable calculation).
The patch is published in that Google Groups thread, but it has to be applied to the kernel source files.
Can the support team verify the patch and eventually release a new kernel version with the patch applied?

To reproduce the problem:
1. create a volume on the SAN and map it to the 2 hosts
2. configure the device in multipath mode
3. configure the ocfs2 cluster on a separate subnet (a rough sketch of the cluster configuration is after this list)
4. format the device with mkfs.ocfs2:
Code:
mkfs.ocfs2 -b 4K -C 4K -J size=4M -N 16 -L {VOLUMENAME}  --cluster-name=ClusterX --cluster-stack=o2cb --force --global-heartbeat /dev/mapper/{DEVICENAME} --fs-feature-level=max-features
5. create 2 VMs with qcow2 disks (one on each host)
6. run Iometer simultaneously on both VMs
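As referenced in step 3, the o2cb cluster itself is described in /etc/ocfs2/cluster.conf on both hosts. A minimal sketch is below, assuming two nodes on a dedicated subnet; node names, IP addresses and the heartbeat region UUID are placeholders, and the exact stanzas (especially for global heartbeat) should be double-checked against the ocfs2-tools documentation:
Code:
cluster:
        name = ClusterX
        heartbeat_mode = global
        node_count = 2

node:
        cluster = ClusterX
        number = 0
        name = host1
        ip_address = 10.0.100.1
        ip_port = 7777

node:
        cluster = ClusterX
        number = 1
        name = host2
        ip_address = 10.0.100.2
        ip_port = 7777

heartbeat:
        cluster = ClusterX
        region = <UUID of the global heartbeat device>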

PS: in the meantime I'm using an LVM configuration...

Thanks
Giovanni
 
Not yet. The patches got applied in v4.16-rc1, but not yet to any -stable tree. I'll see whether they apply cleanly in the next round of kernel updates; if they do, they can probably go in.
 
Just today, I stumbled over the recent Oracle UEK kernel changelog and they're indeed still applying patches to OCFS2 - so I was wrong in my earlier comments about OCFS2 being abandoned.
 
I think you should look at the latest Linus git tree for ocfs2 patches.
e.g.
commit 71a36944042b7d9dd71f6a5d1c5ea1c2353b5d42
Author: Changwei Ge <ge.changwei@h3c.com>
Date: Wed Jan 31 16:15:06 2018 -0800
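One quick way to list those patches, assuming a checkout of Linus' tree, is something like this (the starting tag is just an example):
Code:
# show ocfs2 commits that went in after v4.15
git log --oneline v4.15.. -- fs/ocfs2/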
 
Hello,
Any news on this topic? Fabian?
Regards

The latest pve-kernels, available on pve-no-subscription since today, contain two ocfs2 cherry-picks. Feedback on whether they solve your issue(s) would be appreciated!
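For anyone following along, pulling that kernel onto a node with the pve-no-subscription repository enabled should just be the usual upgrade plus a reboot; the repository line shown is for PVE 5.x on Debian Stretch and may need adjusting for your setup:
Code:
# /etc/apt/sources.list.d/pve-no-subscription.list
# deb http://download.proxmox.com/debian/pve stretch pve-no-subscription

apt update
apt full-upgrade          # pulls in the new pve-kernel package
reboot
uname -r                  # verify the running kernel after the reboot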
 
