The first is the most relevant. On just one of our machines, during the update to the latest 7.4-x ahead of 7to8, a prompt popped up warning that grub was unable to install onto /dev/sda and that this may result in an unbootable system. If we chose no, it just returned to that same prompt.
However, if we exit, it finishes the update and asks us to reboot for the new kernel. But we don't want to... what if it doesn't come up?
Running apt autoremove gets rid of one kernel but doesn't change the behaviour. Likewise, if we try to run grub-install.real /dev/sda, we get:
Code:
# grub-install.real /dev/sda
Installing for i386-pc platform.
grub-install.real: error: disk `lvmid/hSR5FY-F3Nd-Dr52-Dqlk-ooBd-W5rj-r9c9kP/smq4EU-juQG-ft5n-gqMV-7NjL-aUSW-bHJxCW' not found.
df shows disk usage as minimal. The server isn't overloaded in that sense. This feels like an odd bug potentially, or a configuration issue?
We have pressed pause for now, as we don't want to carry on if there are risks ahead. Hopefully we can get to the bottom of it!
Thank you for any assistance.
Then after the OK prompt with all the text, it listed the 3 options.
/dev/sda or /dev/sda3 (LVM) or the pve-root one (dm-8?). Choosing /dev/sda did the same thing.
Hi, this might be an indication that grub won't come up after a reboot, due to a bug in its LVM metadata parser [1]. Can you please post the output of the following commands, so we can check if this is indeed the case?
Indeed I would advise against rebooting right now. Are you booting in legacy mode or UEFI mode?
EDIT: If you are on PVE 7 and the output of vgscan contains a line like Reading metadata ... (+N) where N is anything but 0, there is a wraparound in the LVM metadata buffer and grub will most likely fail to boot after a reboot. In that case, please follow the suggestion in [1] and trigger an LVM metadata update, e.g. lvcreate -L 4M pve -n grubtemp. Afterwards, there should be no wraparound anymore (vgscan shows Reading metadata ... (+0)), re-installing grub should succeed, and the host should reboot fine.
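To make the (+N) check less error-prone than reading the output by eye, here is a minimal Python sketch that scans saved `vgscan -vvv` output for non-zero wraparound values. The function name `find_wraparounds` and the regular expression are illustrative assumptions based only on the line format quoted in this thread; other LVM versions may word the line differently.

```python
import re

# Matches lines such as:
#   Reading metadata summary from /dev/sda3 at 1042944 size 5632 (+2485)
#   Reading metadata from /dev/sda3 at 1042944 size 5632 (+2485)
# The pattern is an assumption derived from the output shown in this thread.
READ_RE = re.compile(
    r"Reading metadata(?: summary)? from (\S+) at (\d+) size (\d+) \(\+(\d+)\)"
)


def find_wraparounds(vgscan_output):
    """Return {device: wrapped_bytes} for every device whose (+N) is non-zero."""
    wrapped = {}
    for line in vgscan_output.splitlines():
        match = READ_RE.search(line)
        if match and int(match.group(4)) != 0:
            wrapped[match.group(1)] = int(match.group(4))
    return wrapped
```

An empty result means no wraparound was seen in the text passed in. The input would be whatever `vgscan -vvv 2>&1` printed, for example saved to a file first; the sketch itself does not run any LVM command.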
Once fixed, what is the best-practice method for running grub-install, update-grub, etc.?
dpkg output:
Code:
# dpkg -l | grep grub
ii grub-common 2.06-3~deb11u6 amd64 GRand Unified Bootloader (common files)
ii grub-efi-amd64-bin 2.06-3~deb11u6 amd64 GRand Unified Bootloader, version 2 (EFI-AMD64 modules)
ii grub-efi-ia32-bin 2.06-3~deb11u6 amd64 GRand Unified Bootloader, version 2 (EFI-IA32 modules)
ii grub-pc 2.06-3~deb11u6 amd64 GRand Unified Bootloader, version 2 (PC/BIOS version)
ii grub-pc-bin 2.06-3~deb11u6 amd64 GRand Unified Bootloader, version 2 (PC/BIOS modules)
ii grub2-common 2.06-3~deb11u6 amd64 GRand Unified Bootloader (common files for version 2)
vgscan output:
Code:
# vgscan -vvv 2>&1 | grep metadata
metadata/record_lvs_history not found in config: defaulting to 0
File locking settings: readonly:0 sysinit:0 ignorelockingfailure:0 global/metadata_read_only:0 global/wait_for_locks:1.
Reading metadata summary from /dev/sda3 at 1042944 size 5632 (+2485)
Found metadata summary on /dev/sda3 at 1042944 size 8117 for VG pve
Reading VG pve metadata from /dev/sda3 4096
VG pve metadata check /dev/sda3 mda 4096 slot0 offset 1038848 size 8117
Reading metadata from /dev/sda3 at 1042944 size 5632 (+2485)
Logical volume pve/lvol0_pmspare is pool metadata spare.
Found metadata on /dev/sda3 at 1042944 size 8117 for VG pve
metadata/lvs_history_retention_time not found in config: defaulting to 0
Found volume group "pve" using metadata type lvm2
So these snippets would confirm your hunch?
grub* 2.06-3~deb11u6 on PVE 7.4-x
Reading metadata summary from /dev/sda3 at 1042944 size 5632 (+2485)
Reading metadata from /dev/sda3 at 1042944 size 5632 (+2485)
Legacy booting. The root filesystem is XFS, so the ZFS case doesn't apply. proxmox-boot-tool is not yet in use; should we set it up before or after this "fix reboot"?
Code:
# efibootmgr -v
EFI variables are not supported on this system.
# findmnt /
TARGET SOURCE FSTYPE OPTIONS
/ /dev/mapper/pve-root xfs rw,relatime,attr2,inode64,logbufs=8,logbsize=32k,noquota
# proxmox-boot-tool status
Re-executing '/usr/sbin/proxmox-boot-tool' in new private mount namespace..
E: /etc/kernel/proxmox-boot-uuids does not exist.
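On the legacy-vs-UEFI question, the efibootmgr message above already points to legacy boot. A common cross-check on Linux is whether the directory /sys/firmware/efi exists, since the kernel only exposes it when booted via UEFI. A minimal Python sketch of that check follows; the function name `boot_mode` and its parameter are illustrative assumptions, not part of any Proxmox tool.

```python
import os


def boot_mode(efi_dir="/sys/firmware/efi"):
    """Return "uefi" if the kernel exposes EFI firmware data, else "legacy".

    efi_dir is a parameter only so the check can be exercised against a
    test directory; on a real host the default path is the one to use.
    """
    return "uefi" if os.path.isdir(efi_dir) else "legacy"
```

On the host in this thread, `boot_mode()` would be expected to return "legacy", matching the efibootmgr output.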
Regarding the fix in the documentation you mentioned (creating a 4MB LV so that grub can boot once, then removing it): what about the grub-install? Since there is no working config, as it advises, is there not also a need to perform that install before the reboot? Or is the LV enough?
To that end, here are additional pieces of information as you requested, even though this is PVE7. In case it helps, as it is pre-fix.
pvdisplay & vgdisplay:
Code:
# pvdisplay
--- Physical volume ---
PV Name /dev/sda3
VG Name pve
PV Size 3.49 TiB / not usable 2.98 MiB
Allocatable yes
PE Size 4.00 MiB
Total PE 915311
Free PE 4192
Allocated PE 911119
PV UUID pxtsvo-ydEJ-1ocN-dgCp-MPJ7-ovHe-1Zk6u7
# vgdisplay
--- Volume group ---
VG Name pve
System ID
Format lvm2
Metadata Areas 1
Metadata Sequence No 289
VG Access read/write
VG Status resizable
MAX LV 0
Cur LV 20
Open LV 11
Max PV 0
Cur PV 1
Act PV 1
VG Size 3.49 TiB
PE Size 4.00 MiB
Total PE 915311
Alloc PE / Size 911119 / <3.48 TiB
Free PE / Size 4192 / <16.38 GiB
VG UUID hSR5FY-F3Nd-Dr52-Dqlk-ooBd-W5rj-r9c9kP
pvck output:
Code:
# pvck /dev/sda3 --dump headers
label_header at 512
label_header.id LABELONE
label_header.sector 1
label_header.crc 0x16f39267
label_header.offset 32
label_header.type LVM2 001
pv_header at 544
pv_header.pv_uuid pxtsvoydEJ1ocNdgCpMPJ7ovHe1Zk6u7
pv_header.device_size 3839095717376
pv_header.disk_locn[0] at 584 # location of data area
pv_header.disk_locn[0].offset 1048576
pv_header.disk_locn[0].size 0
pv_header.disk_locn[1] at 600 # location list end
pv_header.disk_locn[1].offset 0
pv_header.disk_locn[1].size 0
pv_header.disk_locn[2] at 616 # location of metadata area
pv_header.disk_locn[2].offset 4096
pv_header.disk_locn[2].size 1044480
pv_header.disk_locn[3] at 632 # location list end
pv_header.disk_locn[3].offset 0
pv_header.disk_locn[3].size 0
pv_header_extension at 648
pv_header_extension.version 2
pv_header_extension.flags 1
pv_header_extension.disk_locn[0] at 656 # location list end
pv_header_extension.disk_locn[0].offset 0
pv_header_extension.disk_locn[0].size 0
mda_header_1 at 4096 # metadata area
mda_header_1.checksum 0x8738c460
mda_header_1.magic 0x204c564d3220785b35412572304e2a3e
mda_header_1.version 1
mda_header_1.start 4096
mda_header_1.size 1044480
mda_header_1.raw_locn[0] at 4136 # commit wrapped
mda_header_1.raw_locn[0].offset 1038848
mda_header_1.raw_locn[0].size 8117
mda_header_1.raw_locn[0].checksum 0x7bc97246
mda_header_1.raw_locn[0].flags 0x0
mda_header_1.raw_locn[1] at 4160 # precommit
mda_header_1.raw_locn[1].offset 0
mda_header_1.raw_locn[1].size 0
mda_header_1.raw_locn[1].checksum 0x0
mda_header_1.raw_locn[1].flags 0x0
metadata text at 1042944 crc 0x7bc97246 # vgname pve seqno 289
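The numbers in the pvck dump line up with the vgscan output, which can be checked with a little arithmetic. The sketch below assumes, as the dump suggests (4096 + 1038848 = 1042944, the reported "metadata text" position), that raw_locn[0].offset is relative to the start of the metadata area. The function name `split_ring_buffer` is hypothetical.

```python
def split_ring_buffer(mda_start, mda_size, raw_offset, raw_size):
    """Return (abs_offset, first_chunk, wrapped_bytes) for one metadata copy.

    Assumption: raw_offset is relative to mda_start, and any bytes that do
    not fit before the end of the metadata area wrap around to its beginning.
    """
    abs_offset = mda_start + raw_offset
    mda_end = mda_start + mda_size
    first_chunk = min(raw_size, mda_end - abs_offset)
    return abs_offset, first_chunk, raw_size - first_chunk


# Values taken from the pvck dump above:
# mda_header_1.start, mda_header_1.size, raw_locn[0].offset, raw_locn[0].size
print(split_ring_buffer(4096, 1044480, 1038848, 8117))
# prints (1042944, 5632, 2485)
```

That is the same "at 1042944 size 5632 (+2485)" that vgscan reports: the metadata area ends at byte 1048576, so only 5632 of the 8117 bytes fit before the end and the remaining 2485 wrap around.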
grub outputs:
Code:
# grub-fstest --version
grub-fstest (GRUB) 2.06-3~deb11u6
# grub-fstest -v /dev/sda3 ls
grub-fstest: info: Scanning for DISKFILTER devices on disk proc.
grub-fstest: info: Scanning for mdraid1x devices on disk proc.
grub-fstest: info: Scanning for mdraid09 devices on disk proc.
grub-fstest: info: Scanning for mdraid09_be devices on disk proc.
grub-fstest: info: Scanning for dmraid_nv devices on disk proc.
grub-fstest: info: Scanning for lvm devices on disk proc.
grub-fstest: info: Scanning for ldm devices on disk proc.
grub-fstest: info: scanning proc for LDM.
grub-fstest: info: no LDM signature found.
grub-fstest: info: Scanning for DISKFILTER devices on disk loop0.
grub-fstest: info: Scanning for mdraid1x devices on disk loop0.
grub-fstest: info: Scanning for mdraid09 devices on disk loop0.
grub-fstest: info: Scanning for mdraid09_be devices on disk loop0.
grub-fstest: info: Scanning for dmraid_nv devices on disk loop0.
grub-fstest: info: Scanning for lvm devices on disk loop0.
grub-fstest: info: unknown LVM type thin-pool.
grub-fstest: info: unknown LVM type thin.
grub-fstest: info: unknown LVM type thin.
grub-fstest: info: unknown LVM type thin.
grub-fstest: info: unknown LVM type thin.
grub-fstest: info: unknown LVM type thin.
grub-fstest: info: unknown LVM type thin.
grub-fstest: info: unknown LVM type thin.
grub-fstest: info: unknown LVM type thin.
grub-fstest: info: unknown LVM type thin.
grub-fstest: info: unknown LVM type thin.
grub-fstest: info: unknown LVM type thin.
grub-fstest: info: unknown LVM type thin.
grub-fstest: info: unknown LVM type thin.
grub-fstest: info: Scanning for ldm devices on disk loop0.
grub-fstest: info: scanning loop0 for LDM.
grub-fstest: info: no LDM signature found.
grub-fstest: info: Scanning for DISKFILTER devices on disk host.
grub-fstest: info: Scanning for mdraid1x devices on disk host.
grub-fstest: info: Scanning for mdraid09 devices on disk host.
grub-fstest: info: Scanning for mdraid09_be devices on disk host.
grub-fstest: info: Scanning for dmraid_nv devices on disk host.
grub-fstest: info: Scanning for lvm devices on disk host.
grub-fstest: info: Scanning for ldm devices on disk host.
grub-fstest: info: scanning host for LDM.
grub-fstest: info: no LDM signature found.
(proc) (loop0) (host)
We are refraining from running the fix, as we'd like to assist should you want further information. (EDIT: I see this is more for PVE 8 you want it for)
Please advise when you have the file and if you want it (as this is PVE 7.4 with older grub ver), so we can delete it from this post ASAP. Thank you!
Yes, I would say it is clear that the host is affected by the grub bug I mentioned, because it is running PVE 7 (and the version of grub-pc is 2.06-3~deb11u6), and there is a wraparound in the metadata ring buffer:
Reading metadata summary from /dev/sda3 at 1042944 size 5632 (+2485)
I have just updated the wiki page [1] with more detailed information regarding this bug and the differences between PVE 7 and 8, and posted an update to the other thread [2].
In your case, as you are booting in legacy mode, I would suggest the following:
Create a small logical volume to trigger a metadata update: lvcreate -L 4M pve -n grubtemp. Then verify that vgscan -vvv no longer indicates a wraparound, i.e., prints Reading metadata summary from /dev/sda3 ... (+0)
Reinstall grub-pc to trigger a grub-install to /dev/sda: apt install --reinstall grub-pc. This should now work without errors.
Then, a reboot should be safe. As noted in [1], this is only a temporary workaround for PVE 7 hosts, as grub will fail to boot when there is a wraparound again. The only permanent fix is to upgrade to PVE 8.
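The order of the steps above matters: grub should only be reinstalled once vgscan shows (+0). As a rough illustration, the following Python sketch maps saved `vgscan -vvv` output to the next command from the steps above. It only returns a string and runs nothing; the function name `next_step` and the regular expression are assumptions, while the commands, the VG name pve and the LV name grubtemp are the ones quoted in this thread.

```python
import re

# Assumed line format, taken from the vgscan output quoted in this thread.
WRAP_RE = re.compile(
    r"Reading metadata summary from \S+ at \d+ size \d+ \(\+(\d+)\)"
)


def next_step(vgscan_output):
    """Suggest the next command of the workaround for the given vgscan output."""
    offsets = [int(n) for n in WRAP_RE.findall(vgscan_output)]
    if not offsets:
        # No summary line found: collect the vgscan output first.
        return "vgscan -vvv 2>&1 | grep metadata"
    if any(offsets):
        # Wraparound present: trigger an LVM metadata update.
        return "lvcreate -L 4M pve -n grubtemp"
    # (+0) everywhere: reinstalling grub-pc should now succeed.
    return "apt install --reinstall grub-pc"
```

If grubtemp already exists and vgscan still shows a wraparound, a different LV name would be needed; the sketch does not handle that case.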
As you are booting in legacy mode, rebooting after the upgrade to PVE 8 should properly boot the new grub code in which the LVM metadata parsing bug is fixed. If the system was booting in UEFI mode, manual steps would be necessary as described (now) in the upgrade guide [1].
You can also keep the LV until after the successful upgrade to PVE 8. When there is a wraparound in the metadata ring buffer, the point of creating a small LV is to make some change to the LVM metadata, so that the newly written metadata (with that new LV) does not wrap around anymore and grub 2.06-3 has no problem parsing it. If you want to check the vgscan output for the wraparound, I would do so immediately before the reboot. I would recommend against making any LVM metadata changes between checking vgscan and rebooting, as those might, in the worst case, push the metadata into the wraparound case again.
Sorry, overlooked that one. If you have feedback regarding the dark mode in Proxmox products or the forum, I'd say this thread [1] would be a better place to post it.