Machines lose network connectivity after host network restart

TodorPetkov

Active Member
Mar 21, 2020
Hi,
I have a 3-node cluster running the latest PVE 7 no-subscription version.
On each server, two of the interfaces are in an OVS bond with two VLAN interfaces on it: the first is management (891), the second is corosync (213). I also have a couple of VMs in a third VLAN, 892 (it is not configured on the PVE host itself, only on the VMs).
When I restart networking on the server, the VMs on it lose network connectivity, and I have to change the VM's network card model to restore it. Rebooting the VM does not help, but shutting it down and starting it again works. I tried with VMs in VLANs 891 and 892, and the behavior is the same.

What am I doing wrong? Here is the network config from one of the servers; the others differ only by IP address.

Thanks in advance

Code:
# network interface settings; autogenerated
# Please do NOT modify this file directly, unless you know what
# you're doing.
#
# If you want to manage parts of the network configuration manually,
# please utilize the 'source' or 'source-directory' directives to do
# so.
# PVE will preserve these directives, but will NOT read its network
# configuration from sourced files, so do not attempt to move any of
# the PVE managed interfaces into external files!

auto lo
iface lo inet loopback

auto eno1np0
iface eno1np0 inet manual
#10GBASE-T card

auto eno3
iface eno3 inet manual
#1000BASE-T card

auto eno4
iface eno4 inet manual
#1000BASE-T card

auto eno2np1
iface eno2np1 inet manual
#10GBASE-T card

auto vlan891
iface vlan891 inet static
        address 172.20.145.9/24
        gateway 172.20.145.254
        ovs_type OVSIntPort
        ovs_bridge vmbr0
        ovs_options tag=891
#OVS management

auto vlan213
iface vlan213 inet static
        address 192.168.1.1/24
        ovs_type OVSIntPort
        ovs_bridge vmbr0
        ovs_options tag=213
#OVS corosync

auto bond0
iface bond0 inet manual
        ovs_bonds eno1np0 eno2np1
        ovs_type OVSBond
        ovs_bridge vmbr0
        ovs_options bond_mode=active-backup other_config:bond_updelay=1000
#OVS Bond

auto vmbr0
iface vmbr0 inet manual
        ovs_type OVSBridge
        ovs_ports bond0 vlan213 vlan891
#OVS Bridge
 
Same issue!!! Please help. It's extremely scary to think that any change I make to the host networking will cut off my VMs' network, and they are in a PROD environment!
 
I think I found the cause, but I have no idea how to fix it or whether it works like this by design. When the network is restarted, the bridge gets recreated, but the VMs' interfaces are no longer members of it. When I change the network device model, the interface gets (re)added.

I am looking for a way to add them via the CLI; otherwise I have to go through X VMs in the UI with a lot of clicks.
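
To confirm this from the CLI before touching any VM, here is a minimal sketch. It assumes an OVS bridge named vmbr0 and that VM NICs appear on the host as tap<vmid>i<n> (the usual Proxmox naming). The helper name missing_taps is made up for illustration; it only compares two lists and changes nothing.

```shell
#!/bin/sh
# missing_taps (hypothetical helper): print tap interfaces that exist on the
# host but are not ports of the bridge.
#   $1 = newline-separated interface names
#   $2 = newline-separated bridge port names
missing_taps() {
  printf '%s\n' "$1" | while read -r name; do
    case "$name" in
      tap*)
        # -x: whole-line match, -F: fixed string, so tap10i0 != tap101i0
        printf '%s\n' "$2" | grep -qxF -- "$name" || printf '%s\n' "$name"
        ;;
    esac
  done
}

# On a real host (assumption: the bridge is called vmbr0):
#   missing_taps "$(ls /sys/class/net)" "$(ovs-vsctl list-ports vmbr0)"
```

If a tap turns out to be missing, `ovs-vsctl add-port vmbr0 tap102i0 tag=892` should in principle put it back, but whether that matches everything Proxmox sets up itself is an untested assumption.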
 
Found a quick and dirty solution....but it requires maintenance...so be careful!

Step 1:
Create a new bash script somewhere your user is allowed to execute it.

example: /root/scripts/reload_vm_network

Step 2:
You need to gather some information about your VMs. (THIS IS THE HARD PART IF YOU CHANGE YOUR VMs' OPTIONS A LOT OR IF YOU CREATE LOTS OF VMs)

  • Get the list of your VMIDs

    Code:
    sudo qm list
    Note the VM IDs; you will need them.

  • Get the network info of each VM

    Code:
    qm config <vmid>
    Replace <vmid> with each VM ID you got from the first command.

    Note the line: netX: ....
    Example:

    Code:
    net0: e1000=DE:9D:F0:86:A8:DB,bridge=vmbr1,firewall=0

  • Repeat this step for each VM and take note of it.
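
If there are many VMs, the two lookups above can be chained. This is only a sketch: the helper names vmids_from_list and net_lines are made up, and it assumes `qm list` prints a header line followed by one row per VM with the VMID in the first column.

```shell
#!/bin/sh
# vmids_from_list (hypothetical helper): read `qm list` output on stdin and
# print the VMID column, skipping the header line.
vmids_from_list() {
  awk 'NR > 1 && $1 ~ /^[0-9]+$/ { print $1 }'
}

# net_lines (hypothetical helper): read `qm config <vmid>` output on stdin
# and keep only the netX: lines.
net_lines() {
  grep -E '^net[0-9]+:'
}

# On a real host:
#   qm list | vmids_from_list | while read -r id; do
#     echo "== VM $id"
#     qm config "$id" | net_lines
#   done
```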

Step 3:
Edit the script you created in step 1 and add the information for your VMs as follows:

nano /root/scripts/reload_vm_network

Code:
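#!/bin/bash

# net0 settings of each VM, copied from "qm config <vmid>"
vm101="e1000=DE:3E:6E:FE:C2:D8,bridge=vmbr1"
vm102="e1000=2E:00:A6:1D:62:7F,bridge=vmbr1"

# Set the link down, then restore the original settings
qm set 101 --net0 "${vm101},link_down=1"
qm set 101 --net0 "${vm101}"

qm set 102 --net0 "${vm102},link_down=1"
qm set 102 --net0 "${vm102}"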

IMPORTANT
If you change the MAC address of a VM, you need to edit the MAC address in the script.
If you change the bridge, you need to change it in the script too.
And so on.

If you add a new VM (let's say 103), simply repeat step 2 for that VM ID and add it to the script like this:

Code:
#!/bin/bash

# net0 settings of each VM, copied from "qm config <vmid>"
vm101="e1000=DE:3E:6E:FE:C2:D8,bridge=vmbr1"
vm102="e1000=2E:00:A6:1D:62:7F,bridge=vmbr1"
vm103="e1000=AE:40:B6:5C:CC:3D,bridge=vmbr1"

# Set the link down, then restore the original settings
qm set 101 --net0 "${vm101},link_down=1"
qm set 101 --net0 "${vm101}"

qm set 102 --net0 "${vm102},link_down=1"
qm set 102 --net0 "${vm102}"

qm set 103 --net0 "${vm103},link_down=1"
qm set 103 --net0 "${vm103}"

Step 4:
*** Reload your network config on your proxmox host ***

WARNING: this will disrupt the network on the server. If it is hosted with a dedicated host provider, be prepared to access it via IPMI in case the restart breaks your connectivity.


Code:
systemctl restart networking

Step 5 (Optional):
For convenience, I added a couple of aliases to the bashrc file to make this easier when modifying the host iptables rules and reloading the whole thing.

edit your bashrc file

Code:
nano /root/.bashrc

add the following at the end of the file

Code:
alias net.m='nano /etc/network/interfaces'
alias netall='systemctl restart networking'
alias net1='ifdown vmbr1; ifup vmbr1;'

Explanation:

net.m: this opens the /etc/network/interfaces file for editing (to modify your iptables rules, for example).
netall: this restarts the entire networking of your host (see the warning above).
net1: this takes down ONLY the vmbr1 bridge (in my case), which is attached to all your VMs as the default gateway, so your HOST and other bridges are not affected.

And now, when I need to add a new iptables rule or a forward to one of my VMs under vmbr1, I simply run this in a shell on my host:

Code:
net1

This will reload the modifications to that adapter (vmbr1) and will also "trigger" ALL VMs to reconnect their network interface. You might lose connectivity to the VMs for a few seconds until vmbr1 is up and running but that's to be expected.

Enhancement:
You can create a separate bridge for each VM and have a script for each one so when you change settings for one, you don't affect the others.

To be clear, this is only a workaround. I believe the Proxmox devs should look into this, or perhaps someone has a better solution.

I hope this helps.
 
Hi,
why do you need to restart networking?

If you use ifupdown2, you can simply reload the networking configuration; it will only apply the changes, without shutting down the bridge and the VMs' tap interfaces attached to it.

"ifreload -a", or in the GUI, the "Apply Configuration" button.
 
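The reload suggested above could be wrapped like this. It is a minimal sketch, assuming ifupdown2 is the package that provides `ifreload`; the function name reload_network is made up for illustration.

```shell
#!/bin/sh
# reload_network (hypothetical helper): apply /etc/network/interfaces changes
# with ifupdown2's ifreload instead of restarting the networking service.
reload_network() {
  if command -v ifreload >/dev/null 2>&1; then
    # -a reloads all interfaces marked "auto"
    ifreload -a
  else
    echo "ifreload not found; is ifupdown2 installed?" >&2
    return 1
  fi
}

# On a real host, run as root:
#   reload_network
```

The check only avoids falling back to a full `systemctl restart networking` by accident on a host that still uses the classic ifupdown.
 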
Thank you,

My "mods"


Bash:
# First get the stuff we are interested in:
# (each line of /tmp/t1 looks like "qm 101 net0: virtio=...,bridge=vmbr0")
qm list |grep running| awk '{print "qm config "$1" | grep ^net|sed -e '\''s/^/qm "$1" /'\''"}'|sh > /tmp/t1
pct list |grep running| awk '{print "pct config "$1" | grep ^net|sed -e '\''s/^/pct "$1" /'\''"}'|sh >> /tmp/t1

# Now massage them into what we want:
# NB: I'm *NOT* using the firewall option, so I use it as a "consistent" trigger
#  between pct & qm - pct doesn't have a link_down= option
# sub() strips the trailing colon from "net0:" so the option becomes "-net0"
awk '{sub(/:$/, "", $3); print $1" set "$2" -"$3" firewall=1,"$4}' /tmp/t1 > /tmp/tt1
awk '{sub(/:$/, "", $3); print $1" set "$2" -"$3" "$4}' /tmp/t1 > /tmp/tt2


# DO run these from the CONSOLE!!!!

apt upgrade # or whatever needs the "fun"

systemctl restart networking

bash < /tmp/tt1
bash < /tmp/tt2
 
I had the same issue. Up until now I've just been rebooting each host every time I do significant network changes (not too often, plus it's a homelab). I finally searched this up out of curiosity and ended up here. As a contribution I'll offer my take on the above scripting solutions.

These are just generating the commands to test. Take the 'echo's out to run for real.

One-liner
Bash:
while read -r vmid _ status _; do [[ "$status" == "running" ]] && while read -r net conf; do [[ -n "$net" ]] || continue; echo sudo qm set "$vmid" "--${net:0:-1}" "${conf},link_down=1"; echo sudo qm set "$vmid" "--${net:0:-1}" "${conf}"; done <<< "$(sudo qm config "$vmid" | grep "^net" | grep -v "link_down=1")"; done <<< "$(sudo qm list | tail -n +2)"

Script file / more readable version
Bash:
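#!/bin/bash
# Dry run: prints the "qm set" commands that toggle link_down on every NIC
# of every running VM. Remove the "echo"s to actually run them.
while read -r vmid _ status _; do
  [[ "$status" == "running" ]] &&
  while read -r net conf; do
    # Skip the empty line produced when a VM has no matching net entries
    [[ -n "$net" ]] || continue
    # ${net:0:-1} strips the trailing colon from e.g. "net0:"
    echo sudo qm set "$vmid" "--${net:0:-1}" "${conf},link_down=1"
    echo sudo qm set "$vmid" "--${net:0:-1}" "${conf}"
  done <<< "$(sudo qm config "$vmid" | grep "^net" | grep -v "link_down=1")"
done <<< "$(sudo qm list | tail -n +2)"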
 
Yeah, these scripts aren't "guaranteed". The better, more reliable method: reboot the Proxmox host whenever something changes on Open vSwitch or on other host network interfaces.
 
For me, bringing the interface down and up did nothing and the interface did not join the newly created bridge after host network restart.

So what I did to resolve this was basically switch the interface to another bridge temporarily (from vmbr1 to vmbr0) and then change it back to vmbr1 again:

Bash:
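#!/bin/bash

# When run as an if-up.d hook, act only when vmbr1 comes up
[ "$IFACE" == "vmbr1" ] || exit 0

while read -r vmid _ status _; do
  [[ "$status" == "running" ]] &&
  while read -r net conf; do
    # Skip the empty line produced when a VM has no matching net entries
    [[ -n "$net" ]] || continue
    # Move the NIC to vmbr0 temporarily, then restore the original settings
    tempconf=$(sed 's/vmbr1/vmbr0/' <<< "$conf")
    sudo qm set "$vmid" "--${net:0:-1}" "${tempconf},link_down=1"
    sudo qm set "$vmid" "--${net:0:-1}" "${conf}"
  done <<< "$(sudo qm config "$vmid" | grep "^net" | grep -v "link_down=1")"
done <<< "$(sudo qm list | tail -n +2)"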

Save this file as /etc/network/if-up.d/vm_net_restart and make it executable, so that every time your network stack is restarted, your VM NICs are reattached to the bridge automatically.

Note 1: the code does nothing for vmbr0 itself and needs a fix if you care about that bridge too.
Note 2: If you want to call it manually, do not put it in the if-up.d folder, and remove the line
Code:
[ "$IFACE" == "vmbr1" ] || exit 0
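
One caveat with the plain vmbr1-to-vmbr0 substitution: it would also rewrite a NIC on a bridge such as vmbr10. A slightly stricter variant is sketched below. The helper name swap_bridge is made up for illustration, and it assumes the bridge always appears as a bridge=<name> field in the comma-separated net line.

```shell
#!/bin/sh
# swap_bridge (hypothetical helper): replace bridge=<from> with bridge=<to>
# in a Proxmox netX value, matching the whole field only.
#   $1 = net config string, $2 = bridge to replace, $3 = temporary bridge
swap_bridge() {
  # (^|,) and (,|$) anchor the match to a complete comma-separated field
  printf '%s\n' "$1" | sed -E "s/(^|,)bridge=$2(,|\$)/\\1bridge=$3\\2/"
}

# In the script above, the tempconf line could then become:
#   tempconf=$(swap_bridge "$conf" vmbr1 vmbr0)
```

To try the hook by hand without restarting networking, the variable that ifupdown normally sets can be supplied on the command line: `IFACE=vmbr1 /etc/network/if-up.d/vm_net_restart`.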
 