CentOS 7 crashes on PVE kernel 5.4.78-1

hosting

Hello,

We have 4 nodes running in a cluster: 2 run kernel 5.4.78-1 and 2 run 5.4.73-1.
A CentOS 7 guest (XFS) always crashes on its first reboot on 5.4.78-1, while it is absolutely fine on 5.4.73-1.
This is not storage related: all nodes are connected to the same storage, and we run OCFS2 on top of iSCSI over 10 Gb/s.
We are using the latest CentOS 7 cloud image.

Question: are you aware of this issue? Is it worth upgrading from 5.4.78-1 to a later version?
 
Can you try kernel 5.4.78-2? And what's in the kernel log (dmesg)?
pve-kernel (5.4.78-2) pve pmg; urgency=medium

* revert commit 552b270b5784dc3 "scsi: be2iscsi: Fix a theoretical leak in
beiscsi_create_eqs()" to avoid hangs and kernel oopses on module load

* cherry-pick patch to allow unprivileged whiteout device creation

* bump ABI to 5.4.78-2

-- Proxmox Support Team <support@proxmox.com> Thu, 03 Dec 2020 14:26:17 +0100
 
Same thing on 5.4.78-2, during deployment of the cloud image:

Code:
[  129.207689] cloud-init[1086]: Cleanup    : glib2-2.56.1-7.el7.x86_64                                  76/78
[  129.238270] cloud-init[1086]: Cleanup    : freetype-2.8-14.el7.x86_64                                 77/78
[  148.053169] XFS (sda1): Metadata corruption detected at xfs_inode_buf_verify+0x14d/0x160 [xfs], xfs_inode block 0x1000560 xfs_inode_buf_verify
[  148.054396] XFS (sda1): Unmount and run xfs_repair
[  148.054846] XFS (sda1): First 128 bytes of corrupted metadata buffer:
[  148.055452] ffff918d5f1d7000: b8 00 00 00 00 48 c7 c1 ff ff ff ff 48 89 d7 f2  .....H......H...
[  148.056258] ffff918d5f1d7010: ae 48 89 c8 48 f7 d0 48 83 e8 01 48 89 44 24 20  .H..H..H...H.D$
[  148.057071] ffff918d5f1d7020: 0f 84 2d 01 00 00 4d 85 e4 74 52 4c 89 e6 48 89  ..-...M..tRL..H.
[  148.057883] ffff918d5f1d7030: d7 e8 bc f6 ff ff 85 c0 0f 84 15 01 00 00 b8 00  ................
[  148.058698] ffff918d5f1d7040: 00 00 00 48 c7 c1 ff ff ff ff 4c 89 e7 f2 ae 48  ...H......L....H
[  148.059509] ffff918d5f1d7050: 89 c8 48 f7 d0 4c 8d 70 ff 48 8b 44 24 20 48 83  ..H..L.p.H.D$ H.
[  148.060321] ffff918d5f1d7060: c0 02 48 89 44 24 50 49 8d 34 06 4c 89 e7 e8 7d  ..H.D$PI.4.L...}
[  148.061128] ffff918d5f1d7070: 67 ff ff 49 89 c4 48 85 c0 75 64 eb 1a 48 8b 44  g..I..H..ud..H.D
[  148.062527] XFS (sda1): metadata I/O error in "xfs_trans_read_buf_map" at daddr 0x1000560 len 32 error 117
[  148.063439] XFS (sda1): xfs_imap_to_bp: xfs_trans_read_buf() returned error -117.
[  148.064141] XFS (sda1): xfs_do_force_shutdown(0x8) called from line 3411 of file fs/xfs/xfs_inode.c.  Return address = ffffffffc02edb60
[  148.078114] XFS (sda1): Corruption of in-memory data detected.  Shutting down filesystem
[  148.078897] XFS (sda1): Please umount the filesystem and rectify the problem(s)
[  148.017732] cloud-init[1086]: Cleanup    : 2:vim-minimal-7.4.629-7.el7.x86_64                         78/78Error occurred during writing of a log message: Input/output error
[  148.019465] cloud-init[1086]: Non-fatal POSTTRANS scriptlet failure in rpm package kernel-3.10.0-1160.11.1.el7.x86_64
[  148.020398] cloud-init[1086]: 2020-12-22 17:31:36,710 - util.py[WARNING]: Package upgrade failed
[  148.021190] cloud-init[1086]: FALLBACK: 2020-12-22 17:31:36,710 - util.py[WARNING]: Package upgrade failed
[  148.022051] cloud-init[1086]: FALLBACK: 2020-12-22 17:31:36,711 - util.py[DEBUG]: Package upgrade failed
[  148.022892] cloud-init[1086]: Traceback (most recent call last):
[  148.023517] cloud-init[1086]: File "/usr/lib/python2.7/site-packages/cloudinit/config/cc_package_update_upgrade_install.py", line 92, in handle
[  148.024555] cloud-init[1086]: File "/usr/lib/python2.7/site-packages/cloudinit/distros/rhel.py", line 176, in package_command
[  148.025482] cloud-init[1086]: File "/usr/lib/python2.7/site-packages/cloudinit/util.py", line 2084, in subp
[  148.131330] audit: netlink_unicast sending to audit_pid=484 returned error: -111
[  148.132063] audit: audit_lost=1 audit_rate_limit=0 audit_backlog_limit=8192
[  148.132729] audit: audit_pid=484 reset
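
For anyone reading the dump above: XFS inode buffers are expected to start with the on-disk inode magic "IN" (0x494e), and that is what xfs_inode_buf_verify checks. The sketch below is only an illustration, not a Proxmox or XFS tool. It pulls the bytes out of the hex-dump rows and tests for that magic. The helper names (dump_bytes, starts_with_inode_magic) are made up for this example. In the log above the buffer starts with b8 00, and the bytes appear to be x86-64 machine code rather than inode metadata, which would suggest that the block holds file data where an inode cluster was expected.

```python
import re

# XFS on-disk inode magic "IN" (0x494e); every inode in an inode buffer starts with it.
XFS_DINODE_MAGIC = b"IN"

# One hex-dump row of the kernel log: "<16-hex-digit address>: <16 hex bytes>  <ascii>"
ROW = re.compile(r"\b[0-9a-f]{16}:((?: [0-9a-f]{2}){16})")


def dump_bytes(log_text):
    """Collect the raw bytes from all hex-dump rows found in log_text (hypothetical helper)."""
    out = bytearray()
    for match in ROW.finditer(log_text):
        out += bytes.fromhex(match.group(1).replace(" ", ""))
    return bytes(out)


def starts_with_inode_magic(buf):
    """True if the buffer begins with the XFS inode magic (hypothetical helper)."""
    return buf[:2] == XFS_DINODE_MAGIC


# First two rows of the dump from the log above.
sample = """
[  148.055452] ffff918d5f1d7000: b8 00 00 00 00 48 c7 c1 ff ff ff ff 48 89 d7 f2  .....H......H...
[  148.056258] ffff918d5f1d7010: ae 48 89 c8 48 f7 d0 48 83 e8 01 48 89 44 24 20  .H..H..H...H.D$
"""

buf = dump_bytes(sample)
print(len(buf), starts_with_inode_magic(buf))
```

The same check can be pointed at a full saved dmesg output; it only looks at the first two bytes, so it says nothing about how the wrong data got there.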



Booted kernel 5.4.73-1 on the same node, and it works like a charm.

This is definitely an issue with the latest kernel.
 
Can you try installing the kernels from Ubuntu (which pve-kernel is based on) between v5.4.73 and v5.4.78? This can narrow down the range of commits.
https://kernel.ubuntu.com/~kernel-ppa/mainline/

Is there any issue with a newer kernel inside the VM? And what is the qm config of the VM?
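
Narrowing down the range can be done as a binary search over the releases between the known-good and known-bad kernels. The sketch below is only a planning aid under stated assumptions: the is_bad callback is hypothetical and stands for the manual test (boot the host on that kernel, deploy the CentOS 7 cloud image, reboot the VM, check for the XFS shutdown), and it assumes every release after the first bad one is also bad.

```python
def versions_between(good_patch, bad_patch, series="5.4"):
    """Stable releases strictly between the known-good and known-bad patch levels."""
    return [f"v{series}.{p}" for p in range(good_patch + 1, bad_patch)]


def first_bad(candidates, is_bad):
    """Binary search for the first candidate for which is_bad() returns True.

    is_bad is a hypothetical callback standing for the manual boot-and-test step.
    Returns None if no candidate is bad, which would point at the known-bad
    release itself or at patches carried on top of the mainline kernel.
    """
    lo, hi = 0, len(candidates)
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(candidates[mid]):
            hi = mid       # first bad release is at mid or earlier
        else:
            lo = mid + 1   # first bad release is after mid
    return candidates[lo] if lo < len(candidates) else None


candidates = versions_between(73, 78)
print(candidates)
```

With four candidates (v5.4.74 to v5.4.77), at most three kernels need to be installed and tested.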
 
The issue is still there on 5.4.106-1. After the first setup of a VM with CentOS 7, on reboot:

Code:
[   29.762903] XFS (sda1): First 128 bytes of corrupted metadata buffer:
[   29.764044] ffff9f36f2b0b000: 00 00 00 00 00 00 00 00 88 d9 1a 00 00 00 00 00  ................
[   29.765676] ffff9f36f2b0b010: 07 00 00 00 10 00 00 00 00 00 00 00 00 00 00 00  ................
[   29.767468] ffff9f36f2b0b020: 90 d9 1a 00 00 00 00 00 07 00 00 00 11 00 00 00  ................
[   29.769099] ffff9f36f2b0b030: 00 00 00 00 00 00 00 00 98 d9 1a 00 00 00 00 00  ................
[   29.770793] ffff9f36f2b0b040: 07 00 00 00 12 00 00 00 00 00 00 00 00 00 00 00  ................
[   29.772509] ffff9f36f2b0b050: a0 d9 1a 00 00 00 00 00 07 00 00 00 13 00 00 00  ................
[   29.774117] ffff9f36f2b0b060: 00 00 00 00 00 00 00 00 a8 d9 1a 00 00 00 00 00  ................
[   29.775926] ffff9f36f2b0b070: 07 00 00 00 14 00 00 00 00 00 00 00 00 00 00 00  ................
[   29.777701] XFS (sda1): metadata I/O error in "xfs_trans_read_buf_map" at daddr 0x141ab80 len 32 error 117
[   29.779588] XFS (sda1): xfs_imap_to_bp: xfs_trans_read_buf() returned error -117.
[   29.799585] XFS (sda1): Metadata corruption detected at xfs_inode_buf_verify+0x14d/0x160 [xfs], xfs_inode block 0x141ac80 xfs_inode_buf_verify
[   29.801976] XFS (sda1): Unmount and run xfs_repair
[   29.803001] XFS (sda1): First 128 bytes of corrupted metadata buffer:
[   29.804219] ffff9f36f7485000: c3 0f 1f 80 00 00 00 00 e8 83 ce fd ff 48 89 ef  .............H..
[   29.806033] ffff9f36f7485010: 49 89 c5 e8 78 ce fd ff 48 89 c1 0f 1f 44 00 00  I...x...H....D..
[   29.807841] ffff9f36f7485020: 48 89 da 48 8b 5b 60 48 85 db 75 f4 4d 89 e8 4c  H..H.[`H..u.M..L
[   29.809458] ffff9f36f7485030: 89 e7 31 c0 48 83 c2 68 48 8d 35 71 bc 10 00 e8  ..1.H..hH.5q....
[   29.811194] ffff9f36f7485040: fc 99 02 00 48 83 c4 08 b8 ff ff ff ff 5b 5d 41  ....H........[]A
[   29.812850] ffff9f36f7485050: 5c 41 5d c3 0f 1f 40 00 8b 05 52 18 1a 00 39 45  \A]...@...R...9E
[   29.814464] ffff9f36f7485060: 10 0f 85 33 ff ff ff 31 c0 e9 31 ff ff ff 66 90  ...3...1..1...f.
[   29.816078] ffff9f36f7485070: 48 8b 73 20 4c 33 47 08 48 31 d6 49 09 f0 75 19  H.s L3G.H1.I..u.
[   29.818104] XFS (sda1): metadata I/O error in "xfs_trans_read_buf_map" at daddr 0x141ac80 len 32 error 117
[   29.820056] XFS (sda1): xfs_imap_to_bp: xfs_trans_read_buf() returned error -117.
[   29.821544] XFS (sda1): xfs_iunlink_remove: xfs_imap_to_bp returned error -117.
[   29.823011] XFS (sda1): xfs_inactive_ifree: xfs_ifree returned error -117
[   29.824087] XFS (sda1): xfs_do_force_shutdown(0x1) called from line 1753 of file fs/xfs/xfs_inode.c.  Return address = ffffffffc0596343
[   29.828169] XFS (sda1): I/O Error Detected. Shutting down filesystem
[   29.829203] XFS (sda1): Please umount the filesystem and rectify the problem(s)
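
To compare the failing locations across runs, the "metadata I/O error" lines can be parsed into numbers. A minimal sketch follows, with a made-up helper name (parse_io_errors). Two facts it relies on: XFS disk addresses (daddr) count 512-byte basic blocks, and error 117 is EUCLEAN ("Structure needs cleaning") on Linux, which XFS uses to report filesystem corruption.

```python
import re

# "metadata I/O error" lines as printed in the logs above.
IO_ERROR = re.compile(
    r'XFS \((?P<dev>[^)]+)\): metadata I/O error in "(?P<func>[^"]+)" '
    r"at daddr (?P<daddr>0x[0-9a-f]+) len (?P<len>\d+) error (?P<err>\d+)"
)

BASIC_BLOCK_SIZE = 512  # XFS daddr values count 512-byte basic blocks
EUCLEAN = 117           # Linux errno "Structure needs cleaning"


def parse_io_errors(log_text):
    """Return one dict per XFS metadata I/O error line (hypothetical helper)."""
    results = []
    for m in IO_ERROR.finditer(log_text):
        daddr = int(m.group("daddr"), 16)
        results.append({
            "dev": m.group("dev"),
            "func": m.group("func"),
            "daddr": daddr,
            "byte_offset": daddr * BASIC_BLOCK_SIZE,
            "length_bytes": int(m.group("len")) * BASIC_BLOCK_SIZE,
            "errno": int(m.group("err")),
        })
    return results


# The two error lines from the 5.4.106-1 log above.
sample = """
[   29.777701] XFS (sda1): metadata I/O error in "xfs_trans_read_buf_map" at daddr 0x141ab80 len 32 error 117
[   29.818104] XFS (sda1): metadata I/O error in "xfs_trans_read_buf_map" at daddr 0x141ac80 len 32 error 117
"""

for err in parse_io_errors(sample):
    print(err["dev"], hex(err["daddr"]), err["byte_offset"], err["errno"] == EUCLEAN)
```

Each failed read here covers 32 basic blocks (16 KiB), and the byte offsets are relative to the start of the filesystem on sda1, not to the start of the disk.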
 