[SOLVED] Adding new PCIe card prevents boot.

nertskull

Sep 13, 2020
I have had a working Proxmox system for the past year, but I wanted to try installing a new card (a TV tuner) to pass through to a new machine. When I add the card and boot the system, everything proceeds normally right up to the point where it should reach the login screen, at which point the monitor goes black with no signal. I can't ssh in or do anything to see the machine.

If I take the card out and boot again, the system boots as normal. I tried looking at the logs, and the only thing I could see was this:

Code:
Jun  7 08:56:29 orange systemd-udevd[3624]: Using default interface naming scheme 'v240'.                     
Jun  7 08:56:29 orange systemd-udevd[3624]: link_config: autonegotiation is unset or enabled, the speed and duplex are not writable.
Jun  7 08:56:29 orange systemd-udevd[3624]: Could not generate persistent MAC address for vmbr0: No such file or directory
Jun  7 08:56:29 orange networking[3619]: error: vmbr0: bridge port enp5s0 does not exist                     
Jun  7 08:56:29 orange systemd-udevd[3408]: Using default interface naming scheme 'v240'.
Jun  7 08:56:29 orange systemd-udevd[3408]: link_config: autonegotiation is unset or enabled, the speed and duplex are not writable.
Jun  7 08:56:29 orange systemd-udevd[3408]: Could not generate persistent MAC address for vmbr1: No such file or directory
Jun  7 08:56:29 orange networking[3619]: error: vmbr1: bridge port enp10s0f3 does not exist                                                                             
Jun  7 08:56:29 orange systemd-udevd[3402]: Using default interface naming scheme 'v240'.             
Jun  7 08:56:29 orange systemd-udevd[3402]: link_config: autonegotiation is unset or enabled, the speed and duplex are not writable.
Jun  7 08:56:29 orange systemd-udevd[3402]: Could not generate persistent MAC address for vmbr2: No such file or directory
Jun  7 08:56:29 orange systemd[1]: Starting LVM event activation on device 230:131...                                           
Jun  7 08:56:29 orange systemd[1]: Started LVM event activation on device 230:131.

I have a 4-port NIC, a graphics card, and an HBA card installed. The motherboard is an ASUS Prime X570-Pro. This is my Proxmox version:
Code:
proxmox-ve: 6.3-1 (running kernel: 5.4.78-2-pve)
pve-manager: 6.3-3 (running version: 6.3-3/eee5f901)
pve-kernel-5.4: 6.3-3

It seems to boot fine right up until login time. Is it possible that adding the second card is knocking my NIC offline, so it no longer sees my vmbr0? I don't see why that would lock the system up, though. Also, I tried this with two different TV tuner cards I have, so it's not the card.

I can't access the system with the card plugged in, so I'm not sure how else to troubleshoot. Any suggestions?
 
The new PCI card most likely changes the ordering of other PCI devices. One of them is the NIC.

Run ip link to get a list of the network cards currently detected. enp10s0f3 will no longer show up; it will have a different name.
Change the old name to the new one everywhere it occurs in /etc/network/interfaces, then reboot the node or try running ifreload -a.
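
A minimal sketch of that edit, assuming GNU sed (as shipped on Debian-based Proxmox). rename_nic is a hypothetical helper, not a Proxmox tool, and the new interface name in the usage comment is a placeholder; take the real one from ip link.

```shell
#!/bin/sh
# rename_nic OLD NEW FILE
# Hypothetical helper: replaces every whole-word occurrence of OLD with NEW
# in FILE and leaves the previous version in FILE.bak.
rename_nic() {
    local old="$1"
    local new="$2"
    local file="$3"
    # Keep a backup so the change can be undone from the local console.
    cp -- "$file" "$file.bak"
    # \b is a GNU sed word boundary: renaming enp10s0 must not also
    # rewrite enp10s0f3.
    sed -i "s/\b${old}\b/${new}/g" "$file"
}

# Usage (names are placeholders, check `ip link` first):
#   rename_nic enp10s0f3 enp11s0f3 /etc/network/interfaces
#   ifreload -a
```

The backup copy matters here because a wrong name in this file is exactly what takes the host off the network.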
 
Oh, that makes sense, but how do I do that? When I plug the card in, I get a black screen and can't access the machine, so I can't change anything.
 
Blank screen when connecting a display+keyboard directly to your host?
In that case the renamed NIC isn't the main problem.

Maybe you should blacklist the tuner drivers on the host, since you want to pass the tuner through to a VM anyway.
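
A rough sketch of what blacklisting could look like. The thread never names the tuner's driver, so the module name is a parameter; on the host, lspci -nnk lists it under "Kernel driver in use". write_blacklist is a hypothetical helper and the file name it picks is just a convention assumed here.

```shell
#!/bin/sh
# write_blacklist MODULE [DIR]
# Hypothetical helper: writes a modprobe blacklist entry for MODULE.
# DIR defaults to /etc/modprobe.d; pass another directory to try it out.
write_blacklist() {
    local module="$1"
    local dir="${2:-/etc/modprobe.d}"
    printf 'blacklist %s\n' "$module" > "$dir/blacklist-$module.conf"
}

# Usage (MODULE is a placeholder for the driver reported by `lspci -nnk`):
#   write_blacklist MODULE
#   update-initramfs -u    # so the blacklist also applies in early boot
```

A reboot is needed afterwards, since a driver that is already loaded stays bound to the card.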
 
Yeah, black screen w/ display/keyboard directly to the host. I can't do anything. But I've seen no other errors in the log other than as above. It goes through the entire boot sequence though, right up to the point it would normally drop to a login prompt. Then nothing.
 
Make sure VMs do not start automatically, and disable all PCI passthrough by pressing e in the boot menu and adding amd_iommu=off to the kernel command line. Adding a card can change various PCI IDs and, because devices in the same IOMMU group cannot be shared between the host and a VM, even devices that were not passed through might be unbound from the Proxmox host by vfio-pci. If the network device(s) come after the new PCI ID assigned to the TV tuner, then their bus numbers, and therefore their names, are shifted by 1. Personally, I have had no good experiences with PCIe TV tuners, but hopefully yours does work with passthrough.
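
To see which devices ended up sharing an IOMMU group after the new card went in, something like the following could be run on the host once it is reachable again (with the IOMMU enabled, since the groups only exist then). It is a sketch that only reads sysfs; list_iommu_groups is a hypothetical name, and the optional argument exists so the function can be pointed at a test directory.

```shell
#!/bin/sh
# list_iommu_groups [ROOT]
# Prints one "group N: PCI-address" line per device found under ROOT
# (default: /sys/kernel/iommu_groups).
list_iommu_groups() {
    local root="${1:-/sys/kernel/iommu_groups}"
    local dev
    local group
    for dev in "$root"/*/devices/*; do
        # Skip the unexpanded glob when no groups exist (IOMMU disabled).
        [ -e "$dev" ] || continue
        group="${dev#"$root"/}"   # strip the root prefix
        group="${group%%/*}"      # keep only the group number
        printf 'group %s: %s\n' "$group" "${dev##*/}"
    done
}

# Usage:
#   list_iommu_groups
```

If the NIC and a passed-through device show up in the same group, that would explain the host losing the NIC as soon as the VM starts.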
 
This was it. I had one of the VMs starting at boot with a passed-through NIC. But with the card IDs changed once the new PCIe card was added, that autostarting VM took over the graphics card instead. So I couldn't see anything locally, and without the NIC, I couldn't ssh in.

Anyway, I turned off the autostart, updated the IDs of all the passthrough cards, updated the interfaces file, and things work now.
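
For anyone checking the same thing: a small sketch that lists the passthrough and autostart settings of all VMs, assuming the standard Proxmox config location /etc/pve/qemu-server. list_passthrough is a hypothetical helper name.

```shell
#!/bin/sh
# list_passthrough [DIR]
# Prints every hostpciN and onboot line from the VM config files in DIR
# (default: /etc/pve/qemu-server), prefixed with the file name.
list_passthrough() {
    local dir="${1:-/etc/pve/qemu-server}"
    grep -H -E '^(hostpci[0-9]+|onboot):' "$dir"/*.conf
}

# Usage:
#   list_passthrough
#   lspci -nn    # compare the addresses against what is installed now
```

Any hostpci address that no longer matches the intended device in the lspci output needs updating before the VM is started again.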

Thanks so much!
 
Hey nertskull, happy to hear you were able to resolve your issue. I am posting here because I have a very similar issue myself. In my case it's a third, low-profile GPU which I am trying to set up as the host GPU. Once this card is in, Proxmox becomes inaccessible. Can you please explain how you did the renaming / remapping of your PCI devices?
 
Attach a display and keyboard to your server and log in (or use webKVM if you have a BMC). Then run ip addr to see the names of your NICs.
Then replace the old NIC names with the new ones using nano /etc/network/interfaces, and run systemctl restart networking so that the changes take effect.
 
Thank you Dunuin. I will try it out.
 
Hello Dunuin,
I have followed your instructions, and I got the following to look at: https://pond-country-dbb.notion.sit...etwork-setup-a510a26a4bf545d58b4b06a853b9deb9

But to be honest, I am not sure what to change, what to change it to, or why. Can you please explain in detail?

Thank you in advance.
 
And what is the output of ip addr when the PVE host becomes inaccessible after adding the GPU?
 
I apologize for my lack of knowledge of Linux. I didn't know how to check ip addr while the host is inaccessible, so I googled how to troubleshoot previous boot cycles. journalctl -o short-precise -k -b -1 was the best I could find. Within that output I searched for eth1. This was my finding:

0000:3a:00.0 enp58s0 without 3rd card
0000:3b:00.0 enp59s0 with 3rd card
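
A sketch of how that search could be narrowed down, assuming the kernel logs renames in the usual "PCI-address new-name: renamed from ethN" form (an assumption about the log format, not something shown in this thread). renamed_nics is a hypothetical helper that filters whatever log text is piped into it.

```shell
#!/bin/sh
# renamed_nics
# Reads kernel log lines on stdin and prints only the
# "PCI-address new-name: renamed from old-name" part of each rename message.
renamed_nics() {
    grep -oE '[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7] [^ :]+: renamed from [^ ]+'
}

# Usage (kernel messages of the previous boot):
#   journalctl -k -b -1 | renamed_nics
```

Running it once for a boot without the card and once for a boot with it shows which names moved.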

I went back into /etc/network/interfaces and changed enp58s0 to enp59s0 (https://pond-country-dbb.notion.site/enp58s0-enp59s0-347a23290bf0479b976d50e00b53ed22), with high hopes that I was doing the right thing. Well, it didn't work.
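
For reference, the name follows from the PCI address: the bus and slot are hexadecimal in the address and decimal in the interface name, so bus 3a becomes 58 and bus 3b becomes 59. A minimal sketch of that conversion, with pci_to_ifname as a hypothetical helper. It assumes PCI domain 0000 and ignores the fN suffix that multi-function cards get (as in enp10s0f3).

```shell
#!/bin/sh
# pci_to_ifname DOMAIN:BUS:SLOT.FUNC
# Prints the base predictable interface name (enp<bus>s<slot>) for a PCI
# address. Assumes domain 0000; does not append an fN function suffix.
pci_to_ifname() {
    local addr="$1"              # e.g. 0000:3a:00.0
    local rest="${addr#*:}"      # 3a:00.0
    local bus="${rest%%:*}"      # 3a
    local slotfn="${rest#*:}"    # 00.0
    local slot="${slotfn%%.*}"   # 00
    # printf converts the 0x-prefixed hex values to decimal.
    printf 'enp%ds%d\n' "0x$bus" "0x$slot"
}

# Usage:
#   pci_to_ifname 0000:3a:00.0
#   pci_to_ifname 0000:3b:00.0
```

This is only a cross-check; the name reported by ip link on the running system is the one that counts.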

The host starts (without graphics output), and the Windows VM which was set to 'auto start' starts successfully this time. The passthrough on it works. But it doesn't have any internet. No SSH to the host. PVE is not accessible.

I have tried booting with the other nic connected (desperate). No success.

Now I don't know how to get access to PVE (really desperate) to change back to previous settings.

I'm out of ideas for today. Any suggestions are very welcome.

And thank you in advance
 
First, always disable "auto start" of VMs with PCI passthrough, especially when changing hardware. Otherwise you can lock yourself out, for example when the autostarting VM crashes the whole host because it is passing through the wrong PCIe device.

When SSH and the webUI aren't working, you should usually still be able to attach a keyboard and display to your server. Proxmox boots into a console, so you just need to log in as root with your root password. Then you can type in commands like ip addr directly.
 
Thank you for your patience. After a fair bit of trial and error I managed to get into the console, made the first GPU available for PVE, and got this output for ip addr.
 
So try it with...
Code:
auto lo
iface lo inet loopback

iface enp57s0 inet manual

auto vmbr0
iface vmbr0 inet static
        address 192.168.0.49/24
        gateway 192.168.0.1
        bridge-ports enp57s0
        bridge-stp off
        bridge-fd 0
...then a systemctl restart networking and check if the webUI is working again.

If it still doesn't work, either plug your ethernet cable into the other NIC or try the above again with "enp59s0" in both places where it says "enp57s0".
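
Rather than guessing between the two names, the link state of each NIC can be read from sysfs at the local console. This is a sketch with a hypothetical helper name, nic_link_state; note that an interface that has not been brought up reports "down" even with a cable attached.

```shell
#!/bin/sh
# nic_link_state [ROOT]
# Prints "<interface> <operstate>" for every interface under ROOT
# (default: /sys/class/net).
nic_link_state() {
    local root="${1:-/sys/class/net}"
    local dev
    for dev in "$root"/*; do
        # Skip entries that are not interfaces.
        [ -r "$dev/operstate" ] || continue
        printf '%s %s\n' "${dev##*/}" "$(cat "$dev/operstate")"
    done
}

# Usage:
#   nic_link_state
#   ip -br link    # similar information from iproute2
```

The physical NIC that reports "up" is the one to put in the bridge-ports line.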
 
Thank you kindly Dunuin. This time it worked. I now know what to do if I lock myself out and the network is down. Very much appreciate your help.
 