[SOLVED] can't restart or reload networking.service (error: cannot add dependency job)

scyto

As some are aware, my networking was messy the last few days (through entirely my own contributions).

  • I moved back to the Proxmox current repo version of ifupdown2
  • I removed all SDN configuration and hit apply, all my VMs continue to work
  • the network seems stable
I have the error below on all nodes, and it blocks systemctl restart and reload commands on networking.service. It doesn't seem to cause any other issue, and a reboot fixes it. Is there a way to fix it without rebooting?

Code:
Apr 24 10:41:16 pve1 systemd[1]: networking.service: Cannot add dependency job, ignoring: Unit networking.service failed to load properly, please adjust/correct and reload service manager: Device or resource busy
Apr 24 10:41:22 pve1 systemd[1]: networking.service: Cannot add dependency job, ignoring: Unit networking.service failed to load properly, please adjust/correct and reload service manager: Device or resource busy
Apr 24 11:11:38 pve1 systemd[1]: networking.service: Cannot add dependency job, ignoring: Unit networking.service failed to load properly, please adjust/correct and reload service manager: Device or resource busy
[root@pve1 11:12:30]$ systemctl reload networking.service
Failed to reload networking.service: Unit networking.service failed to load properly, please adjust/correct and reload service manager: Device or resource busy
See system logs and 'systemctl status networking.service' for details.
[root@pve1 11:12:38]$ systemctl restart networking.service
Failed to restart networking.service: Unit networking.service failed to load properly, please adjust/correct and reload service manager: Device or resource busy
See system logs and 'systemctl status networking.service' for details.

If the answer is 'shrug' I can just reboot all 3 nodes, but if anyone wants to do more analysis let me know (it would seem essential that the service is never left in this state).
 
This is often caused by issues with ifupdown2 reloading the network configuration. You can check the debug output of a reload via:

Code:
ifreload -avd

Also, as the error message says, you might already find the issue in the system logs via journalctl.
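
To get a quick picture of how often the failure shows up in the journal, the log can be filtered for the message from the first post. This is only a minimal sketch: the function name `count_dep_job_errors` is hypothetical, and it only counts matching lines, it does not diagnose the cause.

```shell
#!/bin/sh
# Hypothetical helper: count log lines that carry the error from the first post.
# Reads log text on stdin, so it works on journalctl output or a saved log file.
count_dep_job_errors() {
    # grep -c prints the number of matching lines; it exits non-zero when the
    # count is 0, so "|| true" keeps the function's exit status clean.
    grep -c 'networking.service: Cannot add dependency job' || true
}

# Example usage on a node (current boot only):
#   journalctl -b --no-pager | count_dep_job_errors
```

Comparing the count on a node that reloads fine with one that does not can help narrow down when the unit entered the bad state.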
 
Thanks, output attached. Notes:
  • pve1 is the node where networking.service won't reload
  • pve3 is the node I rebooted, and there networking.service does reload
  • I note both nodes are still trying to do BGP things. I checked frr.conf and, despite my SDN config being empty in the UI, it seems the BGP settings were never cleaned up from frr.conf... I can remove them by hand, so no biggie. Based on this I suspect I may have copied frr.conf > frr.conf.local and caused this myself, so this is likely an unrelated note
This is the point in the journal on pve1 where the error was encountered:

Code:
Apr 25 08:56:17 pve1 root[3584322]: Attempting to restart frr.service for lo
Apr 25 08:56:17 pve1 systemd[1]: networking.service: Cannot add dependency job, ignoring: Unit networking.service failed to load properly, please adjust/correct and reload service manager: Device or resource busy

There is no corresponding error after the lo restart (which is a script I wrote that calls frr restart when if-up is processed because lo was changed). The script is the same on both nodes.

I am OK to just reboot the two remaining nodes with the restart issue if that's easier. I will leave the repro up until you tell me to reboot :-)
 

Rebooting might be easier; judging from your logs there is still some leftover configuration in frr.conf. Is it also possible that you copied configurations around and forgot to replace the router-id or hostname?

The following log messages in your journal seem to indicate that there are duplicate router IDs:

Code:
Apr 25 08:55:55 pve1 bgpd[3582622]: [MVZKX-EG443][EC 33554452] bgp_process_packet: BGP OPEN receipt failed for peer: 10.0.0.82
Apr 25 08:55:57 pve1 bgpd[3582622]: [HZN6M-XRM1G] %NOTIFICATION: sent to neighbor 10.0.0.83 6/7 (Cease/Connection Collision Resolution) 0 bytes
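
If copies of each node's frr.conf are available in one place, a quick way to look for a copy-and-paste slip is to extract the router-id values and print any that occur more than once. A rough sketch, assuming the IDs appear on `bgp router-id` or `router-id` lines; the function name `duplicate_router_ids` and the file names in the usage comment are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print router-id values that appear more than once
# across the given frr.conf-style files (one path per argument).
duplicate_router_ids() {
    # awk splits on whitespace, so indented lines inside "router bgp" match too.
    awk '$1 == "bgp" && $2 == "router-id" { print $3 }
         $1 == "router-id"               { print $2 }' "$@" |
        sort | uniq -d
}

# Example usage, after copying each node's config to a local directory:
#   duplicate_router_ids pve1-frr.conf pve3-frr.conf
```

Empty output means no ID was seen twice in the files given; it does not rule out other causes for the messages above.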

Also seems like there is a broken FRR configuration still existing?

Code:
Apr 25 08:56:17 pve1 frrinit.sh[3584454]: [3584454|staticd] Configuration file[/etc/frr/frr.conf] processing failure: 2
Apr 25 08:56:17 pve1 frrinit.sh[3584444]: The route-map 'MAP_VTEP_IN' does not exist.
Apr 25 08:56:17 pve1 frrinit.sh[3584444]: The route-map 'MAP_VTEP_OUT' does not exist.
Apr 25 08:56:17 pve1 frrinit.sh[3584444]: The route-map 'MAP_VTEP_IN' does not exist.
Apr 25 08:56:17 pve1 frrinit.sh[3584444]: The route-map 'MAP_VTEP_OUT' does not exist.
[...]
Apr 25 08:56:17 pve1 frrinit.sh[3584451]: line 13: % Unknown command[4]: exit-vrf
Apr 25 08:56:17 pve1 frrinit.sh[3584454]: line 13: % Unknown command[4]: exit-vrf
Apr 25 08:56:17 pve1 frrinit.sh[3584437]: line 13: % Unknown command[4]: exit-vrf
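
The "route-map ... does not exist" messages suggest the config still references route-maps that are no longer defined. One way to list such leftovers is sketched below; it assumes references look like `neighbor <peer> route-map <name> in|out` and definitions start with `route-map <name>`, and the function name `undefined_route_maps` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: list route-maps that a frr.conf-style file ($1)
# references on neighbor lines but never defines.
undefined_route_maps() {
    awk '
        $1 == "route-map"                      { defined[$2] = 1 }  # definition
        $1 == "neighbor" && $3 == "route-map"  { used[$4] = 1 }     # reference
        END {
            for (name in used)
                if (!(name in defined))
                    print name
        }
    ' "$1" | sort
}

# Example usage on a node:
#   undefined_route_maps /etc/frr/frr.conf
```

Any name it prints is a candidate for removal from the neighbor statements, or for re-adding the missing definition, before restarting frr.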
 
Thanks, I had to reboot all nodes for other reasons. It's entirely possible I made a cut-and-paste mistake. Also, IIRC one of the guides I was playing with actually had me create a bridge on each node with the same IP; this may have been the root of the mistake, as my existing openfabric would have just picked that up and propagated it...

Either way I have a working setup now and will post it soon; it's mostly written up. I would like to find a way to do it with SDN at some future juncture, when SDN supports dual stack...