[SOLVED] Proxmox Host hangs at load screen

Something is asking it to terminate...
https://www.fosslinux.com/121761/the-abcs-of-linux-signals-sigint-sigterm-and-sigkill-explained.htm



Alright then, how about posting journal output that goes further back? Instead of -b -1 (which means the previous boot), use something like --since="2024-07-10" or simply --since "5 days ago".



I do not like to guess unless I have to. I would reckon the reason is somewhere in the log of the first boot that failed, i.e. the first time it did not proceed to boot properly. Then start from there.
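
A minimal sketch of collecting that wider journal range into a file for sharing. The helper name save_journal_since and the default output file name are made up for illustration; --since and --no-pager are standard journalctl options.

```shell
# Hypothetical helper: dump the journal from a given point in time into a file.
save_journal_since() {
    since="$1"                     # e.g. "2024-07-10" or "5 days ago"
    out="${2:-journal-since.txt}"  # output file name is an assumption
    # --no-pager writes plain text instead of opening the interactive pager
    journalctl --since "$since" --no-pager > "$out"
}
```

For example, save_journal_since "5 days ago" pve-journal.txt. To find the first boot that failed, journalctl --list-boots lists the recorded boots, and journalctl -b with one of the listed offsets prints just that boot.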
Here are the logs as requested (had to use my Google Drive due to file size):
https://drive.google.com/file/d/1r_yeIwLQJB5wLkx2t_kfzgoRWhFSx1f6/view?usp=drive_link

Perhaps it would be more useful if I simply did a clean reboot, get to the hung screen, then grab that log?
 

So there are different scenarios happening there, and I guess in some instances you also just "pulled the plug" on it.

Anyhow there's e.g.

Code:
Jul 09 17:23:17 pve systemd[1]: corosync.service - Corosync Cluster Engine was skipped because of an unmet condition check (ConditionPathExists=/etc/corosync/corosync.conf).

Which possibly corresponds to the part where you mention:

Here is what I tried:

1. In GRUB, removed quiet and amd_iommu=on. It loads further but hangs here:
(screenshot attachment)

Now I do not know if you created an additional problem there, but one thing that caught my attention (only now) is that you mention that you "removed" a GRUB option ...

So instead of just removing it (I believe "on" is not a valid value for it anyway), you should have set it to amd_iommu=off. You can try that with the next boot, and indeed ...

Perhaps it would be more useful if I simply did a clean reboot, get to the hung screen, then grab that log?

It would be good to capture the log from one boot with all conditions known (without rushing to reboot manually) and see why it's not booting up by itself (after all the tries).
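
A rough sketch of making amd_iommu=off persistent, assuming the host boots via GRUB with the usual /etc/default/grub (hosts booted through systemd-boot keep the kernel command line elsewhere). The function name is hypothetical. It drops any existing amd_iommu=... token from GRUB_CMDLINE_LINUX_DEFAULT, appends amd_iommu=off, and leaves a .bak copy of the file.

```shell
# Hypothetical helper: force amd_iommu=off in the default kernel command line.
set_amd_iommu_off() {
    file="$1"   # normally /etc/default/grub
    # 1st expression: remove any existing amd_iommu=<value> token
    # 2nd expression: append amd_iommu=off before the closing quote
    sed -E -i.bak \
        -e '/^GRUB_CMDLINE_LINUX_DEFAULT=/ s/ *amd_iommu=[^ "]*//g' \
        -e '/^GRUB_CMDLINE_LINUX_DEFAULT=/ s/"$/ amd_iommu=off"/' \
        "$file"
}
```

Run as root, this would be set_amd_iommu_off /etc/default/grub, followed by update-grub and a reboot. After the reboot, cat /proc/cmdline shows whether the option was actually picked up.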
 
I turned off the server and booted it up the way it was when I first encountered these problems. I waited a few minutes before shutting it off; normally it boots up in about 30 seconds. Here are the log files:

Second last log file: https://pastebin.com/8PCRyGgH

Here is the last boot log (though it might be the one that boots into the live CD, hence why I included the one previous): https://pastebin.com/CN0XY1B6
 

Alright and what's not working exactly now? :) Also, can you SSH in?
 

It just hangs indefinitely at this screen and does not get to the shell:
1720986229549.png

The web GUI doesn't work. I haven't tried to SSH to the Proxmox host before (I usually just access it through the web GUI).

Did the logs not reveal anything?
 
So I checked the network settings because "maybe somehow the Proxmox host address was changed?". It seems my host IP did change, because it was very different. When I go to that new web address, I am able to access the web GUI....

Sorry for wasting your time. On the bright side, I did enjoy going through logs and googling everything.
 
Haha I was just going to reply - the logs looked ok, I would normally go check networking because that machine looks happy.

If you have a display issue, it may have to do with the IOMMU (one log you included had it off, the other did not). But the machine looked up and running ... so anyhow, good thread for anyone to follow!
 
On a separate note, don't you use a static IP for PVE?

PS Always check ping, then SSH, then GUI ... at times you may only have e.g. a browser cache problem, so worrying about the GUI is the last thing ...
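
That order can be sketched as a small helper. The function name check_pve is made up, and the sketch assumes nc and curl are available on the machine you test from; 8006 is the default port of the PVE web interface.

```shell
# Hypothetical helper: test reachability in the order ping, SSH, web GUI.
check_pve() {
    host="$1"
    if ! ping -c 3 "$host" >/dev/null 2>&1; then
        echo "ping failed: check the IP address and network config first"
        return 1
    fi
    # -z only probes the port, -w 3 gives up after three seconds
    if ! nc -z -w 3 "$host" 22 >/dev/null 2>&1; then
        echo "ping ok, but SSH port 22 is unreachable"
        return 2
    fi
    # -k skips certificate verification (PVE uses a self-signed cert by default)
    if ! curl -k -s -o /dev/null --max-time 5 "https://$host:8006/"; then
        echo "ping and SSH ok, but the web GUI on port 8006 does not answer"
        return 3
    fi
    echo "ping, SSH and web GUI all respond"
}
```

If all three respond but the browser still shows nothing, the problem is more likely on the browser side than on the host.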
 
If I shut down, and start it cold, my bios disables iommu. I always need to go into the bios and enable it upon a cold boot for some reason. It's like the setting never stays saved I guess.

I am just unsure why the Proxmox ip changed tbh. Seems odd.

Edit: Seems the ips of all my VMs changed too.
 
If I shut down, and start it cold, my bios disables iommu. I always need to go into the bios and enable it upon a cold boot for some reason. It's like the setting never stays saved I guess.

In some cases (I remember Gigabyte) you have to enable it in two different places to keep it from reverting after a reboot, but I am not familiar with ASRock.

I am just unsure why the Proxmox ip changed tbh. Seems odd.

So what does cat /etc/network/interfaces and ip a show?

Edit: Seems the ips of all my VMs changed too.

That one would be about how you have them configured, you have not shared anything about your network setup so far...
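
To tell whether the host address is fixed or handed out by DHCP, the method on the bridge stanza in /etc/network/interfaces is the thing to look at. A small sketch, assuming the management bridge is called vmbr0 (an assumption, nothing in this thread confirms it) and using a made-up helper name:

```shell
# Hypothetical helper: print the IPv4 method (static, dhcp, manual, ...) of an interface.
iface_method() {
    iface="$1"
    file="${2:-/etc/network/interfaces}"
    # Matches stanza headers of the form: iface <name> inet <method>
    awk -v want="$iface" \
        '$1 == "iface" && $2 == want && $3 == "inet" { print $4 }' "$file"
}
```

iface_method vmbr0 printing dhcp would mean the address comes from the router and can change with its leases; static means the address is written in the file itself and should match what ip a reports.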
 
