GFS2 crash with kernel 2.6.32-29-pve

operador

Renowned Member
Oct 6, 2011
I have an Intel Modular Server, and when I use the new kernel the GFS2 file system crashes and the node drops out of the cluster.

This is my log of the crash:
May 5 08:56:18 c1blade1 kernel: GFS2: fsid=BladeRunnerDatos.0: fatal: filesystem consistency error
May 5 08:56:18 c1blade1 kernel: GFS2: fsid=BladeRunnerDatos.0: inode = 1136 460000
May 5 08:56:18 c1blade1 kernel: GFS2: fsid=BladeRunnerDatos.0: function = gfs2_rmdir, file = fs/gfs2/ops_inode.c, line = 489
May 5 08:56:18 c1blade1 kernel: GFS2: fsid=BladeRunnerDatos.0: about to withdraw this file system
May 5 08:56:19 c1blade1 kernel: GFS2: fsid=BladeRunnerDatos.0: telling LM to unmount
May 5 08:56:19 c1blade1 kernel: GFS2: fsid=BladeRunnerDatos.0: withdrawn
May 5 08:56:19 c1blade1 kernel: Pid: 134904, comm: rm veid: 0 Not tainted 2.6.32-29-pve #1
May 5 08:56:19 c1blade1 kernel: Call Trace:
May 5 08:56:19 c1blade1 kernel: [<ffffffffa076a45c>] ? gfs2_lm_withdraw+0x11c/0x150 [gfs2]
May 5 08:56:19 c1blade1 kernel: [<ffffffffa076a66d>] ? gfs2_consist_inode_i+0x5d/0x60 [gfs2]
May 5 08:56:19 c1blade1 kernel: [<ffffffffa075c3ca>] ? gfs2_rmdir+0x27a/0x2a0 [gfs2]
May 5 08:56:19 c1blade1 kernel: [<ffffffffa074a5c3>] ? gfs2_holder_uninit+0x23/0x40 [gfs2]
May 5 08:56:19 c1blade1 kernel: [<ffffffffa075c229>] ? gfs2_rmdir+0xd9/0x2a0 [gfs2]
May 5 08:56:19 c1blade1 kernel: [<ffffffffa075c24b>] ? gfs2_rmdir+0xfb/0x2a0 [gfs2]
May 5 08:56:19 c1blade1 kernel: [<ffffffffa075c281>] ? gfs2_rmdir+0x131/0x2a0 [gfs2]
May 5 08:56:19 c1blade1 kernel: [<ffffffff811bb8a1>] ? vfs_rmdir+0xb1/0xe0
May 5 08:56:19 c1blade1 kernel: [<ffffffff811bacaa>] ? lookup_hash+0x3a/0x50
May 5 08:56:19 c1blade1 kernel: [<ffffffff811bea04>] ? do_rmdir+0x184/0x1f0
May 5 08:56:19 c1blade1 kernel: [<ffffffff811f38fe>] ? dnotify_flush+0x7e/0x140
May 5 08:56:19 c1blade1 kernel: [<ffffffff811a8abd>] ? filp_close+0x5d/0x90
May 5 08:56:19 c1blade1 kernel: [<ffffffff811beaa5>] ? sys_unlinkat+0x35/0x40
May 5 08:56:19 c1blade1 kernel: [<ffffffff8100b102>] ? system_call_fastpath+0x16/0x1b
May 5 08:56:19 c1blade1 kernel: no_formal_ino = 1136
May 5 08:56:19 c1blade1 kernel: no_addr = 460000
May 5 08:56:19 c1blade1 kernel: i_size = 3864
May 5 08:56:19 c1blade1 kernel: blocks = 2
May 5 08:56:19 c1blade1 kernel: i_goal = 460000
May 5 08:56:19 c1blade1 kernel: i_diskflags = 0x00000000
May 5 08:56:19 c1blade1 kernel: i_height = 0
May 5 08:56:19 c1blade1 kernel: i_depth = 0
May 5 08:56:19 c1blade1 kernel: i_entries = 1
May 5 08:56:19 c1blade1 kernel: i_eattr = 0
May 5 08:58:36 c1blade1 kernel: vmbr702: port 3(tap1003i0) entering disabled state
May 5 08:58:36 c1blade1 kernel: vmbr702: port 3(tap1003i0) entering disabled state
May 5 08:58:36 c1blade1 kernel: vmbr706: port 5(tap1003i1) entering disabled state
May 5 08:58:36 c1blade1 kernel: vmbr706: port 5(tap1003i1) entering disabled state
May 5 08:58:36 c1blade1 kernel: vmbr710: port 3(tap1003i2) entering disabled state
May 5 08:58:36 c1blade1 kernel: vmbr710: port 3(tap1003i2) entering disabled state
May 5 08:59:04 c1blade1 kernel: INFO: task gfs2_logd:127640 blocked for more than 120 seconds.
May 5 08:59:04 c1blade1 kernel: Not tainted 2.6.32-29-pve #1
May 5 08:59:04 c1blade1 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 5 08:59:04 c1blade1 kernel: gfs2_logd D ffff880a13a6b370 0 127640 2 0 0x00000000
May 5 08:59:04 c1blade1 kernel: ffff880a27f8fd00 0000000000000046 0000000000000000 0000000000000000
May 5 08:59:04 c1blade1 kernel: 000000000002613b 0000000000057754 0000000000000002 ffff880a13a6b370
May 5 08:59:04 c1blade1 kernel: 0000000000000000 0000000104527e0d ffff880a13a6b938 000000000001ec80
May 5 08:59:04 c1blade1 kernel: Call Trace:
May 5 08:59:04 c1blade1 kernel: [<ffffffff8155ea55>] rwsem_down_failed_common+0x95/0x1d0
May 5 08:59:04 c1blade1 kernel: [<ffffffff8155bd80>] ? thread_return+0xbe/0x89e
May 5 08:59:04 c1blade1 kernel: [<ffffffff8155ebb3>] rwsem_down_write_failed+0x23/0x30
May 5 08:59:04 c1blade1 kernel: [<ffffffff81298303>] call_rwsem_down_write_failed+0x13/0x20
May 5 08:59:04 c1blade1 kernel: [<ffffffff8155e0a2>] ? down_write+0x32/0x40
May 5 08:59:04 c1blade1 kernel: [<ffffffffa0751f7f>] gfs2_log_flush+0x2f/0x600 [gfs2]
May 5 08:59:04 c1blade1 kernel: [<ffffffffa0750fdf>] ? gfs2_ail1_empty+0x2f/0x1b0 [gfs2]
May 5 08:59:04 c1blade1 kernel: [<ffffffffa0752629>] gfs2_logd+0xd9/0x140 [gfs2]
May 5 08:59:04 c1blade1 kernel: [<ffffffffa0752550>] ? gfs2_logd+0x0/0x140 [gfs2]
May 5 08:59:04 c1blade1 kernel: [<ffffffff810a2106>] kthread+0x96/0xa0
May 5 08:59:04 c1blade1 kernel: [<ffffffff8100c34a>] child_rip+0xa/0x20
May 5 08:59:04 c1blade1 kernel: [<ffffffff810a2070>] ? kthread+0x0/0xa0
May 5 08:59:04 c1blade1 kernel: [<ffffffff8100c340>] ? child_rip+0x0/0x20
May 5 08:59:04 c1blade1 kernel: INFO: task vgs:134928 blocked for more than 120 seconds.
May 5 08:59:04 c1blade1 kernel: Not tainted 2.6.32-29-pve #1
May 5 08:59:04 c1blade1 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 5 08:59:04 c1blade1 kernel: vgs D ffff880a06d67230 0 134928 9473 0 0x00000000
May 5 08:59:04 c1blade1 kernel: ffff88083a2df918 0000000000000082 ffff88083a2df8d8 ffffffff8144627c
May 5 08:59:04 c1blade1 kernel: ffff88083a2df904 ffff880100010000 ffff88083a2df8a8 000000002f6eb752
May 5 08:59:04 c1blade1 kernel: ffff880657d8e5f8 0000000000000286 ffff880a06d677f8 000000000001ec80
May 5 08:59:04 c1blade1 kernel: Call Trace:
May 5 08:59:04 c1blade1 kernel: [<ffffffff8144627c>] ? dm_table_unplug_all+0x5c/0x100
May 5 08:59:04 c1blade1 kernel: [<ffffffff8155c5d3>] io_schedule+0x73/0xc0
May 5 08:59:04 c1blade1 kernel: [<ffffffff811eef63>] __blockdev_direct_IO_newtrunc+0x1613/0x1af0
May 5 08:59:04 c1blade1 kernel: [<ffffffff81293cfc>] ? put_dec+0x10c/0x110
May 5 08:59:04 c1blade1 kernel: [<ffffffff811ea9f0>] ? blkdev_get_block+0x0/0x20
May 5 08:59:04 c1blade1 kernel: [<ffffffff811ef4b7>] __blockdev_direct_IO+0x77/0xe0
May 5 08:59:04 c1blade1 kernel: [<ffffffff811ea9f0>] ? blkdev_get_block+0x0/0x20
May 5 08:59:04 c1blade1 kernel: [<ffffffff811eba67>] blkdev_direct_IO+0x57/0x60
May 5 08:59:04 c1blade1 kernel: [<ffffffff811ea9f0>] ? blkdev_get_block+0x0/0x20
May 5 08:59:04 c1blade1 kernel: [<ffffffff811343c8>] mapping_direct_IO+0x48/0x70
May 5 08:59:04 c1blade1 kernel: [<ffffffff81137e5b>] generic_file_read_iter+0x60b/0x680
May 5 08:59:04 c1blade1 kernel: [<ffffffff81442e5d>] ? dm_blk_open+0x1d/0x80
May 5 08:59:04 c1blade1 kernel: [<ffffffff811ec384>] ? __blkdev_get+0x1a4/0x3f0
May 5 08:59:04 c1blade1 kernel: [<ffffffff81137f5b>] generic_file_aio_read+0x8b/0xa0
May 5 08:59:04 c1blade1 kernel: [<ffffffff811eae31>] blkdev_aio_read+0x51/0x80
May 5 08:59:04 c1blade1 kernel: [<ffffffff811ac0ba>] do_sync_read+0xfa/0x140
May 5 08:59:04 c1blade1 kernel: [<ffffffff810a2720>] ? autoremove_wake_function+0x0/0x40
May 5 08:59:04 c1blade1 kernel: [<ffffffff811ead8c>] ? block_ioctl+0x3c/0x40
May 5 08:59:04 c1blade1 kernel: [<ffffffff811c1b4a>] ? do_vfs_ioctl+0x8a/0x5d0
May 5 08:59:04 c1blade1 kernel: [<ffffffff811ac985>] vfs_read+0xb5/0x1a0
May 5 08:59:04 c1blade1 kernel: [<ffffffff811acac1>] sys_read+0x51/0x90
May 5 08:59:04 c1blade1 kernel: [<ffffffff8100b102>] system_call_fastpath+0x16/0x1b
May 5 08:59:04 c1blade1 kernel: INFO: task umount:134980 blocked for more than 120 seconds.
May 5 08:59:04 c1blade1 kernel: Not tainted 2.6.32-29-pve #1
May 5 08:59:04 c1blade1 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 5 08:59:04 c1blade1 kernel: umount D ffff88094f409330 0 134980 134744 0 0x00000000
May 5 08:59:04 c1blade1 kernel: ffff88094f553d98 0000000000000086 0000000000000000 ffff88064f523f00
May 5 08:59:04 c1blade1 kernel: 0000000000001d6d ffff8801d27c8f68 ffff88094f553d38 ffffffffa0752b0f
May 5 08:59:04 c1blade1 kernel: 0000000000000000 0000000104526287 ffff88094f4098f8 000000000001ec80
May 5 08:59:04 c1blade1 kernel: Call Trace:
May 5 08:59:04 c1blade1 kernel: [<ffffffffa0752b0f>] ? gfs2_log_write_buf+0xaf/0xd0 [gfs2]
May 5 08:59:04 c1blade1 kernel: [<ffffffff8155c5d3>] io_schedule+0x73/0xc0
May 5 08:59:04 c1blade1 kernel: [<ffffffffa0752339>] gfs2_log_flush+0x3e9/0x600 [gfs2]
May 5 08:59:04 c1blade1 kernel: [<ffffffff810a2720>] ? autoremove_wake_function+0x0/0x40
May 5 08:59:04 c1blade1 kernel: [<ffffffffa07526a8>] gfs2_meta_syncfs+0x18/0x50 [gfs2]
May 5 08:59:04 c1blade1 kernel: [<ffffffffa0758a0d>] gfs2_kill_sb+0x2d/0x80 [gfs2]
May 5 08:59:04 c1blade1 kernel: [<ffffffff811af489>] deactivate_super+0x79/0x90
May 5 08:59:04 c1blade1 kernel: [<ffffffff811cfc2f>] mntput_no_expire+0xbf/0x110
May 5 08:59:04 c1blade1 kernel: [<ffffffff811d0862>] sys_umount+0x82/0x3e0
May 5 08:59:04 c1blade1 kernel: [<ffffffff8100b102>] system_call_fastpath+0x16/0x1b
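
For reference, a file system that has withdrawn like this is normally checked offline with fsck.gfs2 before it is remounted; a rough sketch of the procedure (the mount point and device path below are only placeholders for my setup):
Code:
# unmount the GFS2 volume on every node first
umount /mnt/gfs2-datos

# then run the GFS2 file system checker against the shared block device
fsck.gfs2 -y /dev/mapper/BladeRunnerDatos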

And this is my current version; when I use the old kernel, GFS2 works properly.


pveversion -v
proxmox-ve-2.6.32: 3.2-126 (running kernel: 2.6.32-27-pve)
pve-manager: 3.2-4 (running version: 3.2-4/e24a91c1)
pve-kernel-2.6.32-27-pve: 2.6.32-121
pve-kernel-2.6.32-29-pve: 2.6.32-126
lvm2: 2.02.98-pve4
clvm: 2.02.98-pve4
corosync-pve: 1.4.5-1
openais-pve: 1.1.4-3
libqb0: 0.11.1-2
redhat-cluster-pve: 3.2.0-2
resource-agents-pve: 3.9.2-4
fence-agents-pve: 4.0.5-1
pve-cluster: 3.0-12
qemu-server: 3.1-16
pve-firmware: 1.1-3
libpve-common-perl: 3.0-18
libpve-access-control: 3.0-11
libpve-storage-perl: 3.0-19
pve-libspice-server1: 0.12.4-3
vncterm: 1.1-6
vzctl: 4.0-1pve5
vzprocps: 2.0.11-2
vzquota: 3.1-2
pve-qemu-kvm: 1.7-8
ksm-control-daemon: 1.1-1
glusterfs-client: 3.4.2-1
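
In case it helps anyone, this is roughly how the older kernel can be kept as the default at boot via GRUB (the exact menu entry title is an assumption, take it from your own /boot/grub/grub.cfg):
Code:
# list the installed kernel entries; the titles differ per system
grep menuentry /boot/grub/grub.cfg

# in /etc/default/grub point GRUB_DEFAULT at the wanted entry, for example:
#   GRUB_DEFAULT="Debian GNU/Linux, with Linux 2.6.32-27-pve"
# then regenerate the GRUB configuration and reboot
update-grub
reboot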

I hope this problem with the kernel gets solved.

Thanks.
 
Thanks for your work! I ran into the same problem and this thread helped me. GFS2 only works fine with the -27 kernel.

A little addition:
I had the same problem with the following kernels too:
2.6.32-29-pve
2.6.32-28-pve
2.6.32-26-pve

Does anybody know the reason for this problem?
 
I have the same problem!

proxmox-ve-2.6.32: 3.2-126 (running kernel: 2.6.32-29-pve)
pve-manager: 3.2-4 (running version: 3.2-4/e24a91c1)
pve-kernel-2.6.32-29-pve: 2.6.32-126
lvm2: 2.02.98-pve4
clvm: 2.02.98-pve4
corosync-pve: 1.4.5-1
openais-pve: 1.1.4-3
libqb0: 0.11.1-2
redhat-cluster-pve: 3.2.0-2
resource-agents-pve: 3.9.2-4
fence-agents-pve: 4.0.5-1
pve-cluster: 3.0-12
qemu-server: 3.1-16
pve-firmware: 1.1-3
libpve-common-perl: 3.0-18
libpve-access-control: 3.0-11
libpve-storage-perl: 3.0-19
pve-libspice-server1: 0.12.4-3
vncterm: 1.1-6
vzctl: 4.0-1pve5
vzprocps: not correctly installed
vzquota: 3.1-2
pve-qemu-kvm: 1.7-8
ksm-control-daemon: 1.1-1
glusterfs-client: 3.4.2-1
 
Still occurring with the latest kernel, 2.6.32-30-pve, from Proxmox 3.2.

root@calcium:~# pveversion -v
proxmox-ve-2.6.32: 3.2-129 (running kernel: 2.6.32-30-pve)
pve-manager: 3.2-4 (running version: 3.2-4/e24a91c1)
pve-kernel-2.6.32-20-pve: 2.6.32-100
pve-kernel-2.6.32-30-pve: 2.6.32-130
pve-kernel-2.6.32-23-pve: 2.6.32-109
lvm2: 2.02.98-pve4
clvm: 2.02.98-pve4
corosync-pve: 1.4.5-1
openais-pve: 1.1.4-3
libqb0: 0.11.1-2
redhat-cluster-pve: 3.2.0-2
resource-agents-pve: 3.9.2-4
fence-agents-pve: 4.0.5-1
pve-cluster: 3.0-12
qemu-server: 3.1-16
pve-firmware: 1.1-3
libpve-common-perl: 3.0-18
libpve-access-control: 3.0-11
libpve-storage-perl: 3.0-19
pve-libspice-server1: 0.12.4-3
vncterm: 1.1-6
vzctl: 4.0-1pve5
vzprocps: 2.0.11-2
vzquota: 3.1-2
pve-qemu-kvm: 1.7-8
ksm-control-daemon: 1.1-1
glusterfs-client: 3.4.2-1

Had to revert to pve-kernel-2.6.32-27-pve: 2.6.32-121.
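
For anyone doing the same downgrade, a rough sketch of the steps (the package names are the ones mentioned in this thread; double-check them on your node before removing anything):
Code:
# make sure the known-good kernel is installed
apt-get install pve-kernel-2.6.32-27-pve

# remove the problematic kernel so GRUB falls back to the older one
apt-get remove pve-kernel-2.6.32-30-pve
update-grub
reboot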
 
Hi,

same problem here. I use Proxmox with DRBD on an LVM volume. Mounting the GFS2 volume works on both nodes, but as soon as a write operation hits the volume, a kernel panic occurs. E.g. vi /mnt/gfs2/test kills the node.
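
A minimal way to reproduce it (the mount point is from my setup, the DRBD device name is an assumption):
Code:
# mount the shared GFS2 volume on the node (device name is a placeholder)
mount -t gfs2 /dev/drbd0 /mnt/gfs2

# any small write is enough to trigger the kernel panic on the affected kernels
echo test > /mnt/gfs2/test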

Reverting to pve-kernel-2.6.32-27-pve works for GFS2, but then the web interface isn't working anymore...

Tested with the latest kernel:
Code:
proxmox-ve-2.6.32: 3.3-139 (running kernel: 2.6.32-34-pve)
pve-manager: 3.3-5 (running version: 3.3-5/bfebec03)
pve-kernel-2.6.32-27-pve: 2.6.32-121
pve-kernel-2.6.32-34-pve: 2.6.32-140
pve-kernel-2.6.32-26-pve: 2.6.32-114
lvm2: 2.02.98-pve4
clvm: 2.02.98-pve4
corosync-pve: 1.4.7-1
openais-pve: 1.1.4-3
libqb0: 0.11.1-2
redhat-cluster-pve: 3.2.0-2
resource-agents-pve: 3.9.2-4
fence-agents-pve: 4.0.10-1
pve-cluster: 3.0-15
qemu-server: 3.3-3
pve-firmware: 1.1-3
libpve-common-perl: 3.0-19
libpve-access-control: 3.0-15
libpve-storage-perl: 3.0-25
pve-libspice-server1: 0.12.4-3
vncterm: 1.1-8
vzctl: 4.0-1pve6
vzprocps: 2.0.11-2
vzquota: 3.1-2
pve-qemu-kvm: 2.1-10
ksm-control-daemon: 1.1-1
glusterfs-client: 3.5.2-1
 
