Centos 7 crashes on pve 5.4.78-1

hosting

Member
Jan 9, 2020
Hello,

We have 4 nodes running in a cluster: 2 are on 5.4.78-1 and 2 are on 5.4.73-1.
CentOS 7 (XFS) always crashes on the first reboot on 5.4.78-1, while it is absolutely fine on 5.4.73-1.
This is not storage related: all nodes are connected to the same storage, and we are running OCFS2 on top of iSCSI over 10 Gb/s.
We are using the latest CentOS 7 cloud image.

Question: are you aware of this issue? Is it worth upgrading from 5.4.78-1 to a later version?
 
Can you try kernel 5.4.78-2? And what's in the kernel log (dmesg)?
pve-kernel (5.4.78-2) pve pmg; urgency=medium

* revert commit 552b270b5784dc3 "scsi: be2iscsi: Fix a theoretical leak in
beiscsi_create_eqs()" to avoid hangs and kernel oopses on module load

* cherry-pick patch to allow unprivileged whiteout device creation

* bump ABI to 5.4.78-2

-- Proxmox Support Team <support@proxmox.com> Thu, 03 Dec 2020 14:26:17 +0100
 
Same thing on 5.4.78-2, during deployment of cloud image:

Code:
[  129.207689] cloud-init[1086]: Cleanup    : glib2-2.56.1-7.el7.x86_64                                  76/78
[  129.238270] cloud-init[1086]: Cleanup    : freetype-2.8-14.el7.x86_64                                 77/78
[  148.053169] XFS (sda1): Metadata corruption detected at xfs_inode_buf_verify+0x14d/0x160 [xfs], xfs_inode block 0x1000560 xfs_inode_buf_verify
[  148.054396] XFS (sda1): Unmount and run xfs_repair
[  148.054846] XFS (sda1): First 128 bytes of corrupted metadata buffer:
[  148.055452] ffff918d5f1d7000: b8 00 00 00 00 48 c7 c1 ff ff ff ff 48 89 d7 f2  .....H......H...
[  148.056258] ffff918d5f1d7010: ae 48 89 c8 48 f7 d0 48 83 e8 01 48 89 44 24 20  .H..H..H...H.D$
[  148.057071] ffff918d5f1d7020: 0f 84 2d 01 00 00 4d 85 e4 74 52 4c 89 e6 48 89  ..-...M..tRL..H.
[  148.057883] ffff918d5f1d7030: d7 e8 bc f6 ff ff 85 c0 0f 84 15 01 00 00 b8 00  ................
[  148.058698] ffff918d5f1d7040: 00 00 00 48 c7 c1 ff ff ff ff 4c 89 e7 f2 ae 48  ...H......L....H
[  148.059509] ffff918d5f1d7050: 89 c8 48 f7 d0 4c 8d 70 ff 48 8b 44 24 20 48 83  ..H..L.p.H.D$ H.
[  148.060321] ffff918d5f1d7060: c0 02 48 89 44 24 50 49 8d 34 06 4c 89 e7 e8 7d  ..H.D$PI.4.L...}
[  148.061128] ffff918d5f1d7070: 67 ff ff 49 89 c4 48 85 c0 75 64 eb 1a 48 8b 44  g..I..H..ud..H.D
[  148.062527] XFS (sda1): metadata I/O error in "xfs_trans_read_buf_map" at daddr 0x1000560 len 32 error 117
[  148.063439] XFS (sda1): xfs_imap_to_bp: xfs_trans_read_buf() returned error -117.
[  148.064141] XFS (sda1): xfs_do_force_shutdown(0x8) called from line 3411 of file fs/xfs/xfs_inode.c.  Return address = ffffffffc02edb60
[  148.078114] XFS (sda1): Corruption of in-memory data detected.  Shutting down filesystem
[  148.078897] XFS (sda1): Please umount the filesystem and rectify the problem(s)
[  148.017732] cloud-init[1086]: Cleanup    : 2:vim-minimal-7.4.629-7.el7.x86_64                         78/78
Error occurred during writing of a log message: Input/output error
[  148.019465] cloud-init[1086]: Non-fatal POSTTRANS scriptlet failure in rpm package kernel-3.10.0-1160.11.1.el7.x86_64
[  148.020398] cloud-init[1086]: 2020-12-22 17:31:36,710 - util.py[WARNING]: Package upgrade failed
[  148.021190] cloud-init[1086]: FALLBACK: 2020-12-22 17:31:36,710 - util.py[WARNING]: Package upgrade failed
[  148.022051] cloud-init[1086]: FALLBACK: 2020-12-22 17:31:36,711 - util.py[DEBUG]: Package upgrade failed
[  148.022892] cloud-init[1086]: Traceback (most recent call last):
[  148.023517] cloud-init[1086]: File "/usr/lib/python2.7/site-packages/cloudinit/config/cc_package_update_upgrade_install.py", line 92, in handle
[  148.024555] cloud-init[1086]: File "/usr/lib/python2.7/site-packages/cloudinit/distros/rhel.py", line 176, in package_command
[  148.025482] cloud-init[1086]: File "/usr/lib/python2.7/site-packages/cloudinit/util.py", line 2084, in subp
[  148.131330] audit: netlink_unicast sending to audit_pid=484 returned error: -111
[  148.132063] audit: audit_lost=1 audit_rate_limit=0 audit_backlog_limit=8192
[  148.132729] audit: audit_pid=484 reset
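When comparing runs across kernels, it can help to pull just the XFS read failures out of a long console log. The following is a minimal sketch, not an official tool: the function name `xfs_io_errors` and the small errno table are my own, and the pattern only covers the "metadata I/O error" line format seen in the log above. On Linux, errno 117 is EUCLEAN ("Structure needs cleaning"), which XFS uses to report on-disk corruption.

```python
import re

# Matches the XFS read-failure line format seen in the log above.
IO_ERROR_RE = re.compile(
    r'XFS \((?P<dev>[^)]+)\): metadata I/O error in "(?P<func>[^"]+)" '
    r"at daddr (?P<daddr>0x[0-9a-fA-F]+) len (?P<len>\d+) error (?P<err>\d+)"
)

# Small lookup for readability; 117 is EUCLEAN on Linux, which XFS uses
# as EFSCORRUPTED. Extend as needed.
ERRNO_NAMES = {5: "EIO", 117: "EUCLEAN/EFSCORRUPTED"}


def xfs_io_errors(log_text):
    """Return one dict per XFS metadata I/O error line found in log_text."""
    events = []
    for match in IO_ERROR_RE.finditer(log_text):
        err = int(match.group("err"))
        events.append({
            "device": match.group("dev"),
            "function": match.group("func"),
            "daddr": int(match.group("daddr"), 16),
            "length": int(match.group("len")),
            "errno": err,
            "errno_name": ERRNO_NAMES.get(err, "unknown"),
        })
    return events


if __name__ == "__main__":
    # Sample line copied from the log above.
    sample = (
        '[  148.062527] XFS (sda1): metadata I/O error in '
        '"xfs_trans_read_buf_map" at daddr 0x1000560 len 32 error 117'
    )
    for event in xfs_io_errors(sample):
        print(event)
```

Feeding it the full `dmesg` output from a failing and a working node gives a quick list of affected disk addresses to compare.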



Booted kernel 5.4.73-1 on the same node and it works like a charm.

This is definitely an issue with the latest kernel.
 
Can you try installing the Ubuntu kernels (which pve-kernel is based on) between 5.4.73 and 5.4.78? This can narrow down the range of commits.
https://kernel.ubuntu.com/~kernel-ppa/mainline/

Is there any issue with a newer kernel inside the VM? And what is the qm config of the VM?
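Narrowing down the range between 5.4.73 and 5.4.78 amounts to a small bisection over point releases. The sketch below shows the order in which to test them. It assumes that the intermediate releases 5.4.74 to 5.4.77 are available as builds, that the mainline 5.4.73 build behaves like the working pve kernel, and that once a release shows the corruption all later ones do too. `first_bad` and the `simulated` predicate are hypothetical names; in practice the predicate is "boot the host on this kernel, deploy the CentOS 7 cloud image, and see whether XFS shuts down".

```python
def first_bad(versions, is_bad):
    """Binary-search an ordered list for the first version where is_bad() is True.

    Assumes versions[0] is known good and versions[-1] is known bad.
    """
    lo, hi = 0, len(versions) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(versions[mid]):
            hi = mid  # regression is at mid or earlier
        else:
            lo = mid  # regression is after mid
    return versions[hi]


# Point releases from the known-good to the known-bad kernel (assumed list).
CANDIDATES = [f"5.4.{patch}" for patch in range(73, 79)]

if __name__ == "__main__":
    # Hypothetical stand-in for a manual test; pretends 5.4.76 introduced the bug.
    def simulated(version):
        return int(version.split(".")[2]) >= 76

    print("first bad release:", first_bad(CANDIDATES, simulated))
```

With six candidates this needs at most two test boots beyond the two endpoints already known.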
 
The issue is still there on 5.4.106-1. After the first setup of a VM with CentOS 7, this appears on reboot:

Code:
[   29.762903] XFS (sda1): First 128 bytes of corrupted metadata buffer:
[   29.764044] ffff9f36f2b0b000: 00 00 00 00 00 00 00 00 88 d9 1a 00 00 00 00 00  ................
[   29.765676] ffff9f36f2b0b010: 07 00 00 00 10 00 00 00 00 00 00 00 00 00 00 00  ................
[   29.767468] ffff9f36f2b0b020: 90 d9 1a 00 00 00 00 00 07 00 00 00 11 00 00 00  ................
[   29.769099] ffff9f36f2b0b030: 00 00 00 00 00 00 00 00 98 d9 1a 00 00 00 00 00  ................
[   29.770793] ffff9f36f2b0b040: 07 00 00 00 12 00 00 00 00 00 00 00 00 00 00 00  ................
[   29.772509] ffff9f36f2b0b050: a0 d9 1a 00 00 00 00 00 07 00 00 00 13 00 00 00  ................
[   29.774117] ffff9f36f2b0b060: 00 00 00 00 00 00 00 00 a8 d9 1a 00 00 00 00 00  ................
[   29.775926] ffff9f36f2b0b070: 07 00 00 00 14 00 00 00 00 00 00 00 00 00 00 00  ................
[   29.777701] XFS (sda1): metadata I/O error in "xfs_trans_read_buf_map" at daddr 0x141ab80 len 32 error 117
[   29.779588] XFS (sda1): xfs_imap_to_bp: xfs_trans_read_buf() returned error -117.
[   29.799585] XFS (sda1): Metadata corruption detected at xfs_inode_buf_verify+0x14d/0x160 [xfs], xfs_inode block 0x141ac80 xfs_inode_buf_verify
[   29.801976] XFS (sda1): Unmount and run xfs_repair
[   29.803001] XFS (sda1): First 128 bytes of corrupted metadata buffer:
[   29.804219] ffff9f36f7485000: c3 0f 1f 80 00 00 00 00 e8 83 ce fd ff 48 89 ef  .............H..
[   29.806033] ffff9f36f7485010: 49 89 c5 e8 78 ce fd ff 48 89 c1 0f 1f 44 00 00  I...x...H....D..
[   29.807841] ffff9f36f7485020: 48 89 da 48 8b 5b 60 48 85 db 75 f4 4d 89 e8 4c  H..H.[`H..u.M..L
[   29.809458] ffff9f36f7485030: 89 e7 31 c0 48 83 c2 68 48 8d 35 71 bc 10 00 e8  ..1.H..hH.5q....
[   29.811194] ffff9f36f7485040: fc 99 02 00 48 83 c4 08 b8 ff ff ff ff 5b 5d 41  ....H........[]A
[   29.812850] ffff9f36f7485050: 5c 41 5d c3 0f 1f 40 00 8b 05 52 18 1a 00 39 45  \A]...@...R...9E
[   29.814464] ffff9f36f7485060: 10 0f 85 33 ff ff ff 31 c0 e9 31 ff ff ff 66 90  ...3...1..1...f.
[   29.816078] ffff9f36f7485070: 48 8b 73 20 4c 33 47 08 48 31 d6 49 09 f0 75 19  H.s L3G.H1.I..u.
[   29.818104] XFS (sda1): metadata I/O error in "xfs_trans_read_buf_map" at daddr 0x141ac80 len 32 error 117
[   29.820056] XFS (sda1): xfs_imap_to_bp: xfs_trans_read_buf() returned error -117.
[   29.821544] XFS (sda1): xfs_iunlink_remove: xfs_imap_to_bp returned error -117.
[   29.823011] XFS (sda1): xfs_inactive_ifree: xfs_ifree returned error -117
[   29.824087] XFS (sda1): xfs_do_force_shutdown(0x1) called from line 1753 of file fs/xfs/xfs_inode.c.  Return address = ffffffffc0596343
[   29.828169] XFS (sda1): I/O Error Detected. Shutting down filesystem
[   29.829203] XFS (sda1): Please umount the filesystem and rectify the problem(s)
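The "First 128 bytes of corrupted metadata buffer" dumps can be checked mechanically. An on-disk XFS inode starts with the magic bytes `IN` (0x494e), and `xfs_inode_buf_verify` rejects a buffer whose inodes lack it. The following minimal sketch parses one hexdump row and tests for that magic. The helper names `row_bytes` and `has_inode_magic` are my own, and the pattern assumes the 16-bytes-per-row layout shown in the logs above. It only tells you whether the buffer looks like an inode block, not where the wrong data came from.

```python
import re

# One kernel hexdump row: "<16-hex-digit address>: <16 hex bytes>  <ascii>".
HEX_ROW_RE = re.compile(r"\b[0-9a-f]{16}: ((?:[0-9a-f]{2} ){16})")

XFS_DINODE_MAGIC = b"IN"  # 0x494e, first two bytes of an on-disk XFS inode


def row_bytes(line):
    """Return the 16 data bytes of a hexdump row, or None if line is not a row."""
    match = HEX_ROW_RE.search(line)
    if match is None:
        return None
    return bytes.fromhex("".join(match.group(1).split()))


def has_inode_magic(buf):
    """True if buf begins with the XFS inode magic."""
    return buf[:2] == XFS_DINODE_MAGIC


if __name__ == "__main__":
    # First row of each dump from the 5.4.106-1 log above.
    first_rows = [
        "[   29.764044] ffff9f36f2b0b000: 00 00 00 00 00 00 00 00 88 d9 1a 00 00 00 00 00  ................",
        "[   29.804219] ffff9f36f7485000: c3 0f 1f 80 00 00 00 00 e8 83 ce fd ff 48 89 ef  .............H..",
    ]
    for row in first_rows:
        data = row_bytes(row)
        print(data[:2].hex(), "inode magic present:", has_inode_magic(data))
```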
 