[SOLVED] PVE 8.4 to 9 - networking.service never starts

Hello! I attempted an in-place upgrade from PVE 8.4 to PVE 9 and have run into a strange issue where the networking service never manages to start during boot. As far as I can tell, no one else has hit this going from 8 to 9 so far. I have a three-node cluster and followed the guide on the wiki step by step (and simultaneously) for two of the nodes, leaving the third alone until I knew the process worked. Both upgraded nodes show the same behavior: the job to start networking.service hangs indefinitely.
These are non-production no-subscription systems, so I can blow them up and start over if needed, but I'd like to get these working before I consider moving any other systems I manage from 8 to 9.

I did find instances of a similar issue in the past for people who upgraded from 7 to 8, but the cause there was old NTP packages that had been deprecated in PVE 8 and needed to be removed. I checked the node I left on 8.4 for the packages and script that caused that issue, and they are not present, so I can't imagine they exist on the other two nodes or that they are the cause here. I set up this cluster last month, and all three systems were fresh installs of 8.4 using the latest ISO available on the site at the time.

I tried rebooting one of the two failing systems a few times, but the issue persists.

I have full access to these servers and can take any troubleshooting steps necessary - I just honestly don't know what I should be checking for.
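For example, I can grab output like this from a console session on one of the stuck nodes and post it here:
Code:
# state of the hung unit and any queued start jobs
systemctl status networking.service
systemctl list-jobs
# boot-time journal for the unit
journalctl -b -u networking.service --no-pager
# ifupdown2 also keeps per-run debug logs here, if present
ls -lt /var/log/ifupdown2/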

It might also be worth noting that when I tried the Rescue Boot option from the PVE 9 ISO, I got an error stating it couldn't automatically find my boot zpool or the boot disk. I'm not sure why.



Any help or guidance is very much appreciated!!
 
I am also experiencing this issue on a ProLiant Gen8 server using 802.3ad link aggregation after following the 8 to 9 upgrade guide. The physical link is up for both members of the bonded interface. I'll try disabling aggregation on my switch and report back.
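For reference, that's based on the bond and member state as reported from the shell:
Code:
# LACP aggregator and per-member link state
cat /proc/net/bonding/bond0
# quick link overview (interface names will differ per system)
ip -br link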

 
To both:

What does your /etc/network/interfaces look like? Do you have any post-up scripts in there? Can you post the contents of the file?
Any scripts in /etc/network/if-up.d?
 
I think you're on to something. I have an OpenFabric mesh network for my Ceph cluster, and it does define a post-up command. However, I don't think there is anything nonstandard about my bond0 interface definition.

Code:
# network interface settings; autogenerated
# Please do NOT modify this file directly, unless you know what
# you're doing.
#
# If you want to manage parts of the network configuration manually,
# please utilize the 'source' or 'source-directory' directives to do
# so.
# PVE will preserve these directives, but will NOT read its network
# configuration from sourced files, so do not attempt to move any of
# the PVE managed interfaces into external files!

auto lo
iface lo inet loopback

auto eno1
iface eno1 inet manual

auto eno2
iface eno2 inet manual

auto bond0
iface bond0 inet manual
        bond-slaves eno1 eno2
        bond-miimon 100
        bond-mode 802.3ad

auto vmbr0
iface vmbr0 inet static
        address 192.168.1.62/24
        gateway 192.168.1.1
        bridge-ports bond0
        bridge-stp off
        bridge-fd 0
        post-up /usr/bin/systemctl restart frr.service

source /etc/network/interfaces.d/*

auto ens1
iface ens1 inet static
        mtu 9000

auto ens1d1
iface ens1d1 inet static
        mtu 9000

Code:
# ls /etc/network/if-up.d
bridgevlan  bridgevlanport  chrony  ethtool  mtu  postfix

Nothing that wasn't placed there by the system.
 
Code:
post-up /usr/bin/systemctl restart frr.service

This line is most likely the culprit, because of changes to the frr service. Removing this line should fix it (although I cannot say if your OpenFabric configuration will work after that; I'll have to take a closer look on Monday and reproduce it).
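If you can still get a shell on the affected node (console or SSH), commenting it out and reloading should be enough, roughly:
Code:
# comment out the post-up line in /etc/network/interfaces, then reload with ifupdown2
ifreload -a
# frr itself can be (re)started separately if needed
systemctl restart frr.service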
 
Thank you for your help @shanreich! After deleting the post-up line as well as
Code:
auto ens1
iface ens1 inet static
        mtu 9000

auto ens1d1
iface ens1d1 inet static
        mtu 9000
I was able to boot the system, and the fabric seems to work normally. Ironically enough, one reason I'm playing around with PVE 9 is the new SDN features :).
 
Ahh crap. This cluster is also using an OpenFabric mesh with the same command at the bottom. I'll delete the post-up line and see if everything comes up. Thank you both!
 
Can confirm this post-up line was the source of the issue:
Code:
post-up /usr/bin/systemctl restart frr.service

I had to boot into install debug mode from a PVE 8.4 ISO (debug mode didn't work on the 9.0 ISO, and rescue boot didn't want to work on either ISO) and manually import my pool using the instructions here: https://forum.proxmox.com/threads/proxmox-rescue-disk-trouble.127585/#post-557888

After commenting out the post-up line and rebooting, everything came up as expected! OpenFabric configuration still works as well, and came up automatically.
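For anyone else who gets locked out the same way, the recovery from the debug shell boiled down to roughly this (assuming the stock rpool/ROOT/pve-1 layout; the full steps are in the linked post):
Code:
# import the root pool under a temporary altroot (-f since it was last imported by the installed system)
zpool import -f -R /mnt rpool
# if the root dataset did not mount automatically under /mnt:
zfs mount rpool/ROOT/pve-1
# comment out the offending post-up line
nano /mnt/etc/network/interfaces
# cleanly detach the pool again before rebooting
zpool export rpool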

These nodes were set up using the old full mesh guide for PVE 8, though I see the page has since been updated for PVE 9:
https://pve.proxmox.com/wiki/Full_Mesh_Network_for_Ceph_Server
 
These nodes were set up using the old full mesh guide for PVE 8, though I see the page has since been updated for PVE 9
Yes, I'd be happy to get some feedback on that feature as soon as you get the time to migrate / try it on a new cluster!
 
Yes, I'd be happy to get some feedback on that feature as soon as you get the time to migrate / try it on a new cluster!
Hey Stefan! I had set this cluster up with two mesh networks: one for Ceph and one for corosync. When I originally did it, I used OpenFabric method for the Ceph network and the RSTP loop method for the corosync network since I couldn't figure out how to get a second OpenFabric network running on my systems.

I migrated both networks this afternoon to the new SDN fabrics, and I have to say I am very pleased with the new process! The old (now manual) method wasn't necessarily difficult to set up, but it was a little tedious. The new wizard is certainly more foolproof, and makes setting up new OpenFabric networks a breeze. Migration of the existing OpenFabric Ceph network was easy and probably took about 5 minutes.

Migration of the corosync network gave me a bit more trouble, but I'm not exactly surprised, given the nature of what it does. To migrate it, I commented out all the lines in my /etc/network/interfaces related to the Open vSwitch setup previously created using the RSTP Loop Setup section of the mesh network document. I then created the network in the SDN > Fabrics section and defined my nodes and their respective interfaces. The hope was that I'd hit apply in the main SDN dashboard and it'd sync my fabrics configuration to the other nodes first and then reload networking on all of them. It worked as expected on two of my three nodes, but the configuration didn't take on the first node.
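(As far as I can tell, the Apply button just calls the cluster-wide SDN reload endpoint, so the same step should also be reachable from a shell if needed:)
Code:
# apply / reload the pending SDN configuration on all nodes
pvesh set /cluster/sdn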

After a bit of troubleshooting and rebooting, I checked /etc/frr/frr.conf on the three nodes and found that my first node didn't have the new network defined in it. I copied over the differences from the frr.conf used by nodes 2 and 3, adjusting node-specific values as necessary. After saving, I restarted the frr service, but no dice. Eventually I realized my first node didn't have a dummy interface associated with the new network like the other two did. I thought I was going to have to replicate the dummy interface manually, but thankfully I checked the task log on the node I had initially pushed the changes from and found the command it had tried and failed to push to my first node.
Code:
pvesh set /nodes/NODE-NAME-HERE1/network --regenerate-frr 1
Running that command manually on the first node regenerated the config and created the dummy interface. After a couple of seconds, everything came back up and all three nodes were communicating again. I rebooted a couple of times to make sure the changes persisted, and everything looks good and stable now!
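In shell terms, the checks and the fix boiled down to something like this (the node name is a placeholder, and vtysh is just the frr shell):
Code:
# compare the generated frr config between nodes
md5sum /etc/frr/frr.conf
# what frr is actually running right now
vtysh -c 'show running-config'
# the fabric's address is configured on a dummy interface on each node
ip -br link show type dummy
# regenerate the frr config + interfaces on the node that was skipped
pvesh set /nodes/<nodename>/network --regenerate-frr 1
systemctl status frr.service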

I'm certain it's not best practice or recommended whatsoever to migrate the corosync network in that manner, but these are non-prod systems and I wanted to see if I'd be able to get away with it. At the very least, it's good to know that it can work should no other option be available. Just might have to manually run the regenerate command on nodes that don't come up. I'd also guess this is very much a "your-mileage-may-vary" sort of thing that gets riskier the larger your cluster is.

Now that everything is up and running, I'm very happy with the new SDN Fabrics setup and management experience! It's great to see this officially supported and integrated into the web GUI, and I already know it'll save a bunch of time on future cluster setups.
 
Thanks for your feedback!

Does anything show up in the journal? Could you maybe post the task logs so I can take a look at what went wrong? The dummy interface not being there indicates some kind of problem with applying the ifupdown2 configuration.
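If you still have them, the boot journal for the relevant units plus the task log of the failed apply would be ideal, e.g.:
Code:
journalctl -b -u networking -u frr --no-pager
# list recent tasks on the node, then dump the log of the failed one (UPID taken from the list output)
pvenode task list
pvenode task log <UPID>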
 
Hello, Stefan!

I believe I see what happened, but I'll also attach my logs so you can review them for yourself. I haven't redacted them, as the only real information in them is my fabric subnets and the devices' generic hostnames.

As stated previously, I have two networks: my Ceph network (cephnet) and my corosync network (coronet). My nodes share the same hostname prefix and are numbered 1 through 3. I was making these changes from the web GUI of node 2. The creation of cephnet worked as expected, but the creation of coronet failed to propagate to node 1, and there was a long delay (20 minutes) between the SRV networking reload tasks on node 3 (the first to execute) and node 2. Node 1 never got a networking reload task for coronet.

Looking at the reloadnetworkall log for each network, I noticed that both executed in the same order: starting with node 3, then node 2, and ending with node 1. Here's the log from when I did this for cephnet:
Code:
abi-pxmx-vmh3: reloading network config
info: executing /usr/bin/dpkg -l ifupdown2
abi-pxmx-vmh2: reloading network config
info: executing /usr/bin/dpkg -l ifupdown2
abi-pxmx-vmh1: reloading network config
info: executing /usr/bin/dpkg -l ifupdown2
TASK OK
It seems it did these sequentially, waiting for the task to finish on each node before executing on the next.

In the coronet log, it ran things in the same order but hung waiting on node 3 until it was interrupted by me rebooting the nodes 20 minutes later. Once that was interrupted, it ran the task on node 2, but it was never able to run the task on node 1, since the change had disrupted the corosync network between nodes 1 and 2.
Code:
abi-pxmx-vmh3: reloading network config
command 'pvesh set /nodes/abi-pxmx-vmh3/network --regenerate-frr 1' failed: received interrupt
abi-pxmx-vmh2: reloading network config
info: executing /usr/bin/dpkg -l ifupdown2
abi-pxmx-vmh1: reloading network config

That was also the log where I noticed the command it had tried to run. After running the command on node 1, everything came up as expected. Part of me wonders if it would've worked had I done everything from node 1, since it would've run on node 3, been interrupted by the reboot, then tried node 2 (probably also interrupted by the reboot), and finally run on node 1.
 


Code:
post-up /usr/bin/systemctl restart frr.service

This line is most likely the culprit, because of changes to the frr service. Removing this line should fix it (although I cannot say if your OpenFabric configuration will work after that; I'll have to take a closer look on Monday and reproduce it).
I wonder if this could be included in the pve8to9 check. It would've saved me a day: I had to recover the server using IPMI, and since it boots from ZFS, rescue mode didn't work, so I used Debian's live-build to make a boot ISO with ZFS support. Then, after fixing the interfaces file, you have to reset the ZFS mountpoints back to the normal Proxmox locations. I know it's documented here, but it would be simple to add to the checklist script:
https://pve.proxmox.com/wiki/Upgrade_from_8_to_9#Existing_Ceph_Full_Mesh_Setups_fail_to_boot
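In case it helps the next person, the mountpoint reset at the end is essentially just this (assuming the default dataset name, and only needed if you changed the mountpoint by hand rather than importing with an altroot):
Code:
# point the root dataset back at / before rebooting into the installed system
zfs set mountpoint=/ rpool/ROOT/pve-1
zpool export rpool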