Proxmox hangs at boot

proxwolfe

Jun 20, 2020
Hi,

I am encountering my next problem.

Since everything was working too well, I thought I'd try GPU passthrough to my Windows VM.

I have two NVIDIA cards in my workstation: one GTX 1070 and one Quadro 2000. The latter is the one I wanted to pass through (to avoid code 43).

So I followed this guide:


I added the IOMMU option to the kernel command line in GRUB, added the vfio modules to /etc/modules, and rebooted. Afterwards I was able to list the IOMMU groups, so all good up to this point.
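
For reference, listing the IOMMU groups can be sketched roughly as below. This assumes the usual sysfs layout under /sys/kernel/iommu_groups; the function name list_iommu_groups and its optional root argument are made up for illustration and are not taken from the guide.

```shell
# Print one line per PCI device in the form "group <n>: <pci address>".
# The optional argument (hypothetical, mainly for testing) overrides the
# sysfs directory that holds the IOMMU groups.
list_iommu_groups() {
    root="${1:-/sys/kernel/iommu_groups}"
    for dev in "$root"/*/devices/*; do
        # Skip the unexpanded glob when there are no groups (IOMMU off).
        [ -e "$dev" ] || continue
        group="${dev#"$root"/}"   # strip the root prefix
        group="${group%%/*}"      # keep only the group number
        printf 'group %s: %s\n' "$group" "${dev##*/}"
    done
}
```

Calling list_iommu_groups with no argument on the host prints the groups; no output at all would suggest the IOMMU is not active.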

Since I have a Xeon, it has ACS enabled.

I determined the vendor:device IDs of my Quadro card and did this:

echo "options vfio-pci ids=10de:xxxx,10de:xxxx" > /etc/modprobe.d/vfio.conf

But I did not do this:
echo "blacklist nouveau" >> /etc/modprobe.d/blacklist.conf
echo "blacklist nvidia" >> /etc/modprobe.d/blacklist.conf
because both my cards are NVIDIA cards and I still want my host to be able to use the GTX card.
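
Since only one of the two NVIDIA cards should end up on vfio-pci, it can help to double-check which vendor:device IDs belong to which slot. A minimal sketch, assuming `lspci -nn` output is piped in on stdin; the helper name pci_ids_for_slot is hypothetical, and the slot 04:00 used in the usage note is simply the address that shows up in the kernel message later in this thread.

```shell
# Print the vendor:device ID of every function in one PCI slot.
# Reads `lspci -nn`-style lines on stdin; $1 is the slot, e.g. "04:00".
pci_ids_for_slot() {
    slot="$1"
    # Keep only lines for that slot, then pull out the last [xxxx:xxxx]
    # bracket on each line (class codes like [0300] have no colon).
    grep "^$slot\." | sed -n 's/.*\[\([0-9a-f]\{4\}:[0-9a-f]\{4\}\)\].*/\1/p'
}
```

Usage would be something like `lspci -nn | pci_ids_for_slot 04:00`. After a reboot, `lspci -nnk` additionally shows a "Kernel driver in use" line per device, which should read vfio-pci for the passed-through card only.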

After that, I rebooted.

GRUB came up and handed off to Proxmox, and the last thing it showed was:

"Reading all physical volumes. This may take a while...
Found volume group "pve" using metadata type lvm2
5 logical volume(s) in volume group "pve" now active
/dev/mapper/pve-root: recovering journal
/dev/mapper/pve-root: clean, 50822/18188624 files, 1990742/7274496 blocks"

It has been like this for at least 15 minutes now (this is actually the second time, because I already hard reset my PC once). I cannot connect to the GUI via the browser.

Any ideas what I could do?

Thanks
 
The changes seem unrelated to the hanging boot problem... Did you successfully reboot the machine before you made those changes?
Try booting into an older kernel (if there is one) and check the journal for what might have gone wrong.
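
A rough sketch of what checking the journal could look like once the machine boots again. It assumes the journal is persistent, so that the failed boot is still available as boot -1, and a hard hang may not have flushed its last messages to disk. The helper name filter_gpu_lines and the keyword list are assumptions chosen for this thread, not a fixed recipe.

```shell
# Keep only log lines that mention the GPU-related drivers discussed here.
# Reads log text on stdin; matching is case-insensitive.
filter_gpu_lines() {
    grep -i -E 'vfio|vgaarb|nouveau|nvidia|iommu'
}
```

Usage would be something like `journalctl -b -1 -k | filter_gpu_lines`, where -b -1 selects the previous boot and -k limits the output to kernel messages.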
 
I had rebooted successfully before making the changes.

Some new information on this:

I edited out "quiet" from the kernel command line in grub and then I saw two things:

- the system spent a lot of time on my external usb 3 hard drive (but eventually moved on - so this may be totally unrelated)
- the system now hangs at "vfio-pci 0000:04:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=io+mem:owns=none" (it really does hang: it isn't outputting anything further, I can't reach the web GUI and I can't ssh into it)

So it now seems to have something to do with my latest changes. I should maybe add the following: the BIOS initializes the Quadro card that I am trying to isolate, so until the system hangs everything is output on this card (and the output does not switch to the other card at this point). I don't know why this card is initialized, because in the BIOS settings I am forcing the GTX card. The settings menu does say that it will try all other slots one after another if the forced card can't be initialized, but I have no idea why the GTX should not be initializing, as it did work before.
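
Which card the firmware picked as the primary display is also visible from Linux: display devices under /sys/bus/pci/devices expose a boot_vga attribute that reads 1 for the primary one. A minimal sketch to check it from a boot that works; the function name find_boot_vga and its optional root argument are hypothetical.

```shell
# Print the PCI address of every device whose boot_vga attribute is "1",
# i.e. the display device the firmware initialized as primary.
# The optional argument (hypothetical, mainly for testing) overrides the
# sysfs directory that holds the PCI devices.
find_boot_vga() {
    root="${1:-/sys/bus/pci/devices}"
    for f in "$root"/*/boot_vga; do
        # Skip the unexpanded glob or unreadable attributes.
        [ -r "$f" ] || continue
        if [ "$(cat "$f")" = "1" ]; then
            dev="${f%/boot_vga}"
            printf '%s\n' "${dev##*/}"
        fi
    done
}
```

If find_boot_vga prints 0000:04:00.0, the card being handed to vfio-pci is also the one the host console is running on, which would fit the output freezing at the vgaarb line.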

My next step will be to remove the Quadro card from the system and see if it lets me successfully boot again.
 
Update:

After removing the card, the PC booted again.

I am now trying with an AMD card to avoid issues that might arise (or seem to arise) from using two NVIDIA cards. Once I get it to work with the AMD card, I will come back to this thread and try again with the NVIDIA Quadro card.

Stay tuned...
 
... but I do still welcome any comments in the meantime as to what might have gone wrong

Update:
With the AMD card in the same slot the system boots up.

For some reason, the BIOS this time does not initialize the AMD card but the GTX card that I am forcing, so maybe that is the problem after all?
 
