[SOLVED] Proxmox 8 not fully booting when IOMMU is enabled in BIOS

northnorth

New Member
Dec 2, 2024
Hello, I have encountered a rather odd issue that I just can't seem to find an answer to. The other day my server stopped working; it turns out the CPU got fried by the motherboard. So I bought a new CPU and a new motherboard and installed them. I have enabled the options in BIOS to allow virtualization and IOMMU. However, whenever the IOMMU option in BIOS is enabled, Proxmox fails to fully start up. If it is disabled, the environment boots fully, but the main storage VM won't boot due to IOMMU being disabled.

The CPU I currently have installed is a Ryzen 5950X, so I have enabled IOMMU within the GRUB config file via "amd_iommu=on".
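For reference, that change looks roughly like this in /etc/default/grub (assuming GRUB is the bootloader in use; the rest of the cmdline will differ per install):

GRUB_CMDLINE_LINUX_DEFAULT="quiet amd_iommu=on"

# apply the change and reboot afterwards
update-grub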

The odd symptom I am seeing is that when IOMMU is enabled it seems to boot fine, and then as soon as the cmdline shows up the Ethernet port goes dead and cannot be contacted. Once the cmdline shows up for the login I cannot type anything. My suspicion is that something is making the system hang when IOMMU is enabled, but I have no clue how to check the logs for this or where to even start.

If anyone has any clue where I can start, any help would be great. I have been stuck trying to get this thing working for the past few days now. I will provide as much info as I can if needed.
 
amd_iommu=on does nothing and never did anything since it is enabled by default.
A new motherboard may have different PCI IDs for devices. This can really cause problems when you have VMs with passthrough that start automatically (and are taking the wrong devices possibly in different IOMMU groups, causing freezes or crashes).
Also, the name of the network device depends on the PCI ID (unless you changed it), and you might need to adjust /etc/network/interfaces accordingly (check with ip a).
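For example, after checking the current name with ip a, the bridge stanza in /etc/network/interfaces has to point at that port (interface name and addresses below are placeholders):

iface enp6s0 inet manual

auto vmbr0
iface vmbr0 inet static
        address 192.168.1.10/24
        gateway 192.168.1.1
        bridge-ports enp6s0
        bridge-stp off
        bridge-fd 0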
 
Okay, I wondered whether the amd_iommu=on option would do anything, since I had seen you don't have to add it anymore with Proxmox 8. I have changed the PCI ID to the new one for the SAS controller within the VM config, since I can get to it when IOMMU is off.

I also had to change the network ID within /etc/network/interfaces. Maybe when IOMMU is disabled/enabled the order gets shuffled? I will disable automatic startup on the VMs and then see if it boots up, so I can adjust the interfaces accordingly.

Another thing: this symptom also happened when I was using the old motherboard that fried the CPU. I put in another compatible CPU and the same symptoms appeared.

UPDATE: I have enabled IOMMU and disabled automatic startup. I was able to boot into the Proxmox VE GUI and attempted to start the storage VM. It seems to try to start but hangs and then stalls the system. So that's a step in the right direction. The PCI IDs stayed the same, and I confirmed them before starting.
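In case it helps anyone following along, this is roughly what I mean by disabling autostart and confirming the addresses (VM ID 100 and the grep pattern are just placeholders):

# keep the VM from starting at boot while testing
qm set 100 --onboot 0

# confirm the SAS controller's PCI address on the new board
lspci -nn | grep -i sas

# and compare it against the hostpci entries in the VM config
qm config 100 | grep hostpci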
 
IOMMU on or off usually does not change the PCI IDs. A different CPU or motherboard revision or BIOS version can make a difference.

It's possible that the PCI IDs are the same but the IOMMU groups are different. Maybe update the motherboard BIOS or downgrade it to the version that worked fine?
Maybe show the VM configuration file and the system log (from around the time the VM is started), so other people can help think about this issue?
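For example (100 standing in for the actual VM ID):

# VM configuration
qm config 100

# system log for the current boot, including the VM start attempt
journalctl -b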
 
I have found the issue! It was indeed the IOMMU groups. The group the SAS controller was in had the Ethernet and USB controllers in it as well, so it makes sense why the entire system would lock up on the GUI and I was unable to use the keyboard on the local cmdline. I moved the SAS controller to the top PCIe slot, where its IOMMU group is isolated. I then had to change the network interface, and now it fully works!

Thank you so much for your direction!
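For anyone hitting the same thing, the grouping can be checked with something like this (standard sysfs layout; the passed-through controller should not share a group with the NIC or USB controller):

for g in /sys/kernel/iommu_groups/*; do
    echo "IOMMU group ${g##*/}:"
    for d in "$g"/devices/*; do
        lspci -nns "${d##*/}"
    done
done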
 
