Proxmox boot loops after upgrade from 6.4 to 7.0 (Possible kernel issue)

Jul 8, 2021
Problem
I attempted to upgrade my Proxmox machine from the latest 6.4 release to the new 7.0 version this morning, but the installation now boot loops. Specifically, shortly after the boot drives are successfully decrypted (set up following a tutorial, but without the remote unlock setup), the usual boot messages scroll past rapidly and then the machine reboots itself. A console cable will arrive tomorrow; in the meantime I recorded the screen with my phone's high-speed video and did not see any errors. When I boot the old kernel (5.4 instead of 5.11), the issue does not occur. Running update-initramfs -u -k all && pve-efiboot-tool refresh did not fix the problem.
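
For reference, the regenerate-and-refresh step mentioned above can be sketched as follows. This assumes that newer releases ship the boot tool as proxmox-boot-tool and that pve-efiboot-tool may only remain as a legacy name; first_available is a hypothetical helper, not part of Proxmox.

```shell
# Hypothetical helper: print the first command name from the arguments
# that exists on this system, or return non-zero if none does.
first_available() {
    for cmd in "$@"; do
        if command -v "$cmd" >/dev/null 2>&1; then
            printf '%s\n' "$cmd"
            return 0
        fi
    done
    return 1
}

# Usage (as root on the Proxmox host; assumption: either tool name may exist):
#   tool=$(first_available proxmox-boot-tool pve-efiboot-tool) || exit 1
#   update-initramfs -u -k all && "$tool" refresh
```

As described in the post, this step alone did not resolve the boot loop here; it only rules out a stale initramfs as the cause.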

History
This machine has been running Proxmox 6.x since around late March of this year. I have updated it consistently and have not observed any issues with it.

The boot drives are encrypted, which is not a supported configuration. However, since that has not been an issue in the past and the drives still unlock successfully, I don't think it is the source of the problem.

Immediately before the upgrade, I ran a regular update (which included a kernel update) and rebooted the machine successfully via the web GUI. I also ran pve6to7 --full via SSH and got all green, except for 3 skipped checks (likely because I don't use Ceph).

I conducted the upgrade via SSH and nothing appeared out of the ordinary, other than it wanting to modify the GRUB configuration (changing the name from Proxmox to Debian); I opted to keep all local changes when asked. I attached the logs of the upgrade, but the initial part is missing because my terminal's scrollback was too short.
 

Update: I just tried a fresh install of 6.4 (with subscription), followed by the upgrade to 7.0 per the official documentation, and got the same issue. The full logs from the SSH session are attached if you are curious.

EDIT: I also attached the output of journalctl, where you can see multiple reboots with 5.11 (the new kernel), but no obvious errors beforehand.
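
To check which kernel a given boot in the journal was running, something like the following may help. It is only a sketch: it assumes the journal is persistent (so earlier boots are retained) and that the kernel logs its usual "Linux version ..." banner early in boot; boot_kernel is a hypothetical helper name.

```shell
# Hypothetical helper: print the kernel version banner logged during a boot.
# $1 is a journalctl boot offset: 0 = current boot, -1 = previous boot, ...
boot_kernel() {
    journalctl -k -b "$1" --no-pager 2>/dev/null \
        | grep -m1 -o 'Linux version [^ ]*'
}

# Usage:
#   journalctl --list-boots      # list the recorded boots and their offsets
#   boot_kernel -1               # kernel banner of the previous boot
#   journalctl -b -1 -e          # jump to the end of the previous boot's log
```

A boot whose log simply stops, with no shutdown messages at the end, would be consistent with the machine resetting before anything could be written to disk.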
 

The 5.11 logs cut off right where the rootfs is remounted. Could you check with netconsole or a serial cable whether there really is no output after that?
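
As a rough illustration of the netconsole suggestion, the helper below only assembles the parameter string in the format the kernel's netconsole documentation describes (source port@source IP/interface, target port@target IP/target MAC). The function name, addresses, ports, and interface name are all placeholders, not values from this thread.

```shell
# Hypothetical helper: build a netconsole parameter string.
# $1 source port   $2 source IP   $3 source interface
# $4 target port   $5 target IP   $6 target MAC address
netconsole_param() {
    printf 'netconsole=%s@%s/%s,%s@%s/%s\n' "$1" "$2" "$3" "$4" "$5" "$6"
}

# Usage (placeholder values; run as root on the machine that boot loops):
#   modprobe netconsole "$(netconsole_param 6665 192.0.2.10 eno1 6666 192.0.2.20 00:11:22:33:44:55)"
# On the receiving machine, listen for the UDP packets, for example:
#   nc -u -l 6666        # exact flags depend on the netcat variant
```

Because the reboot happens early in boot, the module may need to be loaded from the initramfs for this to capture anything useful; that part is an assumption and not something confirmed in this thread.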
 
Sorry for the late reply, but I have made some progress. The issue seems to be a BIOS setting: after I reset the BIOS to its default settings, the machine booted normally. After I restored (to the best of my knowledge) the old settings, the problem returned. I am currently going through the settings one by one to isolate it, but have not found the culprit yet.

I may have a serial cable lying around somewhere though, so that might speed things up.
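
For the serial cable route, the kernel has to be told to log to the serial port. A minimal sketch, assuming the first serial port (ttyS0) at 115200 baud; add_serial_console is a hypothetical helper that only edits a command-line string, and the example root= value is a placeholder.

```shell
# Hypothetical helper: append serial console arguments to a kernel command
# line unless a ttyS0 console is already configured.
add_serial_console() {
    cmdline=$1
    case " $cmdline " in
        *" console=ttyS0"*) printf '%s\n' "$cmdline" ;;
        *) printf '%s\n' "$cmdline console=tty0 console=ttyS0,115200n8" ;;
    esac
}

# Usage (placeholder command line):
#   add_serial_console "root=ZFS=rpool/ROOT/pve-1 boot=zfs"
# The result goes wherever the boot loader reads the kernel command line:
# GRUB_CMDLINE_LINUX in /etc/default/grub (then update-grub), or
# /etc/kernel/cmdline for systemd-boot setups (then refresh the boot tool).
```

The other end of the cable then needs a terminal program set to the same speed, with logging enabled so the last lines before the reset are preserved.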
 
Update: I am still slowly going through the settings. So far it is not IOMMU, as I re-enabled those settings and have hardware passthrough of two GPUs working. I have not re-enabled the alternate PCIe routing settings yet, so perhaps those are the issue. I will do a deep dive this weekend to figure it out.
 
