Unusual Networking Issues from GPU Install

GregoInc

New Member
Jul 25, 2022
Hello,

I have a rather unusual issue and so I thought I might provide some detail here and see what suggestions may be offered.

Let me explain... I have a Proxmox server running on a Supermicro H11SSL-i motherboard with an EPYC CPU. The cards in the machine are the following:

- Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Card or otherwise known as an Intel Corporation Ethernet Server Adapter X520-2
- Hyper M.2 x16 Card V2 with 4 x NVME drives

Up until now the server has been running perfectly, with literally no issues. I recently tried installing a GPU that I could pass through to a VM, and when the GPU card (Quadro P600) was installed the Proxmox server appeared to lose network connectivity.

I figured it could be the GPU I was using, so I obtained an Nvidia Tesla P4 GPU and surprisingly the Proxmox server behaved the same when the GPU was installed... a loss of network connectivity. When either GPU is removed the network connectivity in Proxmox is restored.

Having read other posts in this forum, I ran lspci -vvv and journalctl -b to gather data on the possible issue. Something that leapt out at me in the lspci -vvv output was the entry for the 82599ES 10-Gigabit SFI/SFP+ Network Card, see below.

You may notice in the attached lspci-vvv-log.txt all the SiSiSiSiSiSiSiSiSiSiSiSiSiSiSiSi in the Intel Network 10Gb Card entry. I am unsure if this data is normal?
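To look at only the network card's entry rather than the whole dump, something like the following might help. It is just a sketch: it assumes the text comes from lspci -vvv, which separates devices with blank lines, and pci_entry is a made-up helper name, not a standard tool.

```shell
#!/bin/sh
# pci_entry PATTERN: read lspci -vvv text on stdin and print only the
# device blocks whose text matches PATTERN.
# (Hypothetical helper name; lspci separates devices with blank lines,
# so awk's paragraph mode, RS="", yields one record per device.)
pci_entry() {
    awk -v RS= -v pat="$1" '$0 ~ pat { print; print "" }'
}
```

Possible usage: lspci -vvv | pci_entry 82599ES, or pci_entry 82599ES < lspci-vvv-log.txt on the saved log. Saving that output once with the GPU installed and once without, then comparing the two with diff, would show whether the card's bus address or capabilities differ between the two boots.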

I also noted in the attached journalctl-b.txt file, there appear to be a number of error entries, which I believe might relate to the lack of network connectivity? Rather than posting all the log entries, perhaps those in here more knowledgeable than me could suggest a place where I could go and look for further diagnostic data.

As I said, prior to installing the GPU the Proxmox server has been excellent, so I have no idea why adding a GPU is causing this unusual behaviour. I'd appreciate any guidance or suggestions on steps I could take to diagnose and resolve it.
 

Attachments

  • journalctl-b-NoGPU.txt (359.4 KB)
  • journalctl-b.txt (204.7 KB)
  • lspci-vvv-log.txt (284.2 KB)
Up until now the server has been running perfectly, with literally no issues. I recently tried installing a GPU that I could pass through to a VM, and when the GPU card (Quadro P600) was installed the Proxmox server appeared to lose network connectivity.
This is very common because the names of network devices depend on their PCI bus address, and that can change when you add or remove PCI(e) devices. Run ip a to find the new name and adjust /etc/network/interfaces accordingly. There are lots of threads about this on this forum with more detailed information (now that you know what to search for).
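As a rough illustration of that check, the sketch below lists the NIC names that the bridge-ports lines in an interfaces file refer to and reports whether each one currently exists. It assumes the bridges are declared with bridge-ports (or bridge_ports) lines and that present interfaces show up under /sys/class/net; check_bridge_ports is a hypothetical helper name, and both paths are parameters so it can be tried on a copy first.

```shell
#!/bin/sh
# check_bridge_ports [INTERFACES_FILE] [SYS_NET_DIR]
# Print "<nic> present" or "<nic> MISSING" for every NIC named on a
# bridge-ports line. Hypothetical helper; defaults are the usual paths.
check_bridge_ports() {
    conf="${1:-/etc/network/interfaces}"
    sysnet="${2:-/sys/class/net}"
    # Accept both spellings: bridge-ports and bridge_ports.
    awk '$1 == "bridge-ports" || $1 == "bridge_ports" {
             for (i = 2; i <= NF; i++) print $i
         }' "$conf" |
    while read -r nic; do
        # "none" means a bridge without a physical port; skip it.
        [ "$nic" = "none" ] && continue
        if [ -e "$sysnet/$nic" ]; then
            echo "$nic present"
        else
            echo "$nic MISSING"
        fi
    done
}
```

If a name is reported as MISSING after the GPU goes in, ip -br link shows the names the kernel is using now, and the bridge-ports line (plus any iface stanza for that NIC) can be updated to match.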
 

Thank you for the advice, I really appreciate it. I will insert the GPU and then see what changes with the names. Thanks again.
 
Hello
Similar issue here; I lost connectivity after installing a Tesla P40.
When running ip a, I see that I lose the vmbr0 network device, but no other appears.
Did you solve the issue? Can you give me any link or hint?
 
That's weird. Your current situation does not match your configuration. Somehow the network configuration is not starting automatically. This is a very different issue from the OP's.
What is the output of systemctl status networking.service? Any errors in journalctl -b 0 that stand out? And what did you change except adding the GPU add-in card?
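One way to narrow down a long boot log is to filter it for lines that mention the networking units or the bridge. The sketch below is only a suggestion: net_log_lines is a made-up helper name, and the default keyword list is an assumption about what is worth looking for, so pass your own pattern if your NIC driver or bridge has a different name.

```shell
#!/bin/sh
# net_log_lines [EXTENDED_REGEX]
# Filter log text on stdin down to lines matching the pattern
# (case-insensitive). Hypothetical helper; the default pattern is a guess
# at useful keywords, not an official list.
net_log_lines() {
    pattern="${1:-networking\.service|ifupdown2|vmbr[0-9]+}"
    grep -i -E "$pattern"
}
```

Possible usage: journalctl -b 0 --no-pager | net_log_lines, or journalctl -b 0 -p err --no-pager to see only entries logged at error priority or worse.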
 
And what did you change except adding the GPU add-in card?

Nothing else; I installed the GPU and lost vmbr0 after booting.

Today I did a fresh Proxmox install; the boot went OK and I could get to the frontend.
I installed the updates, rebooted, and lost vmbr0 again.
 
Did anything change compared to the photos you showed? I'm assuming not.
What is the output of systemctl status ifupdown2-pre.service (which appears to make networking.service fail because it depends on it)?
 
Did anything change compared to the photos you showed? I'm assuming not.
No. The behaviour was the same: vmbr0 was lost after boot.

The output of ifup:
ifup.GIF
 
Many thanks for your help and patience.

Disabling Nouveau was one of my to-dos, but I expected I could do it later, after installing the NVIDIA drivers.
I'll try it later today and post the result here.
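For reference, Nouveau is usually disabled on Debian-based systems with a modprobe blacklist file. The sketch below writes one; write_nouveau_blacklist is a hypothetical helper name and the file name blacklist-nouveau.conf is just a common convention, so treat it as an example under those assumptions rather than the exact steps for this machine.

```shell
#!/bin/sh
# write_nouveau_blacklist [TARGET_FILE]
# Write a modprobe blacklist for the nouveau driver. The target path is a
# parameter so the result can be inspected in a scratch location before
# anything under /etc/modprobe.d is touched. Hypothetical helper name.
write_nouveau_blacklist() {
    target="${1:-/etc/modprobe.d/blacklist-nouveau.conf}"
    printf '%s\n' 'blacklist nouveau' 'options nouveau modeset=0' > "$target"
}
```

After writing the file as root, update-initramfs -u followed by a reboot makes the blacklist take effect at early boot; lsmod | grep nouveau should then print nothing.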
 
