[SOLVED] Changing Simple Zone vnet subnet settings broke a container.

scyto

I thought I would try and start simple, so I created a Simple zone and a vnet.

The container I attached to the zone came up and was issued an IP address from IPAM. I could ping both itself and the gateway (host) address.

I realised I had set the subnet wrong.

So I added a new subnet with the right settings, tried to start the container, and it failed.
So I removed the old subnet (leaving only one subnet on the vnet) and the container still failed.
Attaching the container to vmbr0 works.
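(For the record, I believe the pvesh equivalents of those GUI subnet changes are roughly the following; the old subnet ID is a placeholder, and the gateway/DHCP values match my config further down:)
Bash:
# add the corrected subnet to the vnet "test" (values from my config below)
pvesh create /cluster/sdn/vnets/test/subnets --type subnet \
    --subnet 10.0.83.0/24 --gateway 10.0.83.1 --snat 1 \
    --dhcp-range start-address=10.0.83.2,end-address=10.0.83.20
# remove the old, wrong subnet (placeholder ID)
pvesh delete /cluster/sdn/vnets/test/subnets/<old-subnet-id>
# apply, same as the Apply button in the GUI
pvesh set /cluster/sdn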

When I applied the SDN settings I saw:
Code:
Removed "/etc/systemd/system/multi-user.target.wants/dnsmasq@node3.service".
Created symlink /etc/systemd/system/multi-user.target.wants/dnsmasq@node3.service -> /lib/systemd/system/dnsmasq@.service.
Job for dnsmasq@node3.service failed because the control process exited with error code.
See "systemctl status dnsmasq@node3.service" and "journalctl -xeu dnsmasq@node3.service" for details.
Could not run after_configure for DHCP server node3 command 'systemctl restart dnsmasq@node3' failed: exit code 1

TASK OK
This only happened once the failure happened; the initial vnet creation was fine.
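For anyone following along, the commands that error message points at are just (non-paginated):
Bash:
systemctl status dnsmasq@node3 --no-pager
journalctl -u dnsmasq@node3 -n 40 --no-pager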

When I start the container it fails with:

Code:
run_buffer: 571 Script exited with status 11
lxc_init: 845 Failed to run lxc.hook.pre-start for container "102"
__lxc_start: 2034 Failed to initialize container "102"
TASK ERROR: startup for container '102' failed
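To get more detail than "status 11", I believe you can start the container in the foreground with LXC debug logging:
Bash:
# run CT 102 in the foreground with debug logging to a file
lxc-start -n 102 -F -l DEBUG -o /tmp/lxc-102.log
grep -iE 'error|fail' /tmp/lxc-102.log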

this is my SDN config
Code:
root@pve3 13:35:52 /etc/frr #  cat /etc/pve/sdn/*.cfg
subnet: node3-10.0.83.0-24
        vnet test
        dhcp-range start-address=10.0.83.2,end-address=10.0.83.20
        gateway 10.0.83.1
        snat 1

vnet: test
        zone node3

simple: node3
        dhcp dnsmasq
        ipam pve
        nodes pve3

And this is my frr.conf, just in case having that subnet defined twice is the issue...

Code:
frr version 8.5.2
frr defaults datacenter
hostname pve3
log syslog informational
service integrated-vtysh-config
!
interface en05
 ip router openfabric 1
 ipv6 router openfabric 1
!
interface en06
 ip router openfabric 1
 ipv6 router openfabric 1
!
interface lo
 ip address 10.0.0.83/32
 ip address 10.0.83.83/24
 ip router openfabric 1
 ipv6 address fc00::83/128
 ipv6 address fd00:0:0:83::83/64
 ipv6 router openfabric 1
 openfabric passive
!
router openfabric 1
 net 49.0000.0000.0003.00
 spf-delay 10
 spf-holdtime 20
 spf-maximum-wait 50
 lsp-gen-interval 5
exit
!
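If the double definition matters, it would be because lo already owns an address inside the SDN subnet (10.0.83.83/24 covers the vnet gateway 10.0.83.1). A quick way to check:
Bash:
# does lo already carry an address inside the SDN subnet?
ip -br addr show dev lo
# which interface/route answers for the 10.0.83.0/24 prefix?
ip route show 10.0.83.0/24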
 
The scenario I am trying to achieve is the simplest way to get a VM to be able to talk to my Ceph public network (which is isolated on the Thunderbolt mesh network).
 
This seems to be the issue. I don't know what it means by the address being in use: SDN is the only thing that ever configured that address, so if it is in use because of SDN, why does this fail?

Code:
░░ Subject: A start job for unit dnsmasq@node3.service has begun execution
░░ Defined-By: systemd
░░ Support: https://www.debian.org/support
░░
░░ A start job for unit dnsmasq@node3.service has begun execution.
░░
░░ The job identifier is 5143.
Apr 25 13:49:58 pve3 systemd-helper[128963]: dnsmasq: failed to create listening socket for 10.0.83.1: Address already in use
Apr 25 13:49:58 pve3 dnsmasq[128963]: failed to create listening socket for 10.0.83.1: Address already in use
Apr 25 13:49:58 pve3 dnsmasq[128963]: FAILED to start up
Apr 25 13:49:58 pve3 systemd[1]: dnsmasq@node3.service: Control process exited, code=exited, status=2/INVALIDARGUMENT
░░ Subject: Unit process exited
░░ Defined-By: systemd
░░ Support: https://www.debian.org/support
░░
░░ An ExecStart= process belonging to unit dnsmasq@node3.service has exited.
░░
░░ The process' exit code is 'exited' and its exit status is 2.
Apr 25 13:49:58 pve3 systemd[1]: dnsmasq@node3.service: Failed with result 'exit-code'.
░░ Subject: Unit failed
░░ Defined-By: systemd
░░ Support: https://www.debian.org/support
░░
░░ The unit dnsmasq@node3.service has entered the 'failed' state with result 'exit-code'.
Apr 25 13:49:58 pve3 systemd[1]: Failed to start dnsmasq@node3.service - dnsmasq (node3) - A lightweight DHCP and caching DNS server.
░░ Subject: A start job for unit dnsmasq@node3.service has failed
░░ Defined-By: systemd
░░ Support: https://www.debian.org/support
░░
░░ A start job for unit dnsmasq@node3.service has finished with a failure.
░░
░░ The job identifier is 5143 and the job result is failed.
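A quick way to see what already holds that socket (dnsmasq wants UDP/TCP 53 and UDP 67 on the gateway address) would be something like:
Bash:
# list listening sockets bound to the contested address, with the owning process
ss -lunpt | grep 10.0.83.1
# or check port 53 on any address, in case a wildcard bind is the culprit
ss -lunpt | grep ':53 '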
 
  1. I stopped the dnsmasq service
  2. I clicked Apply in SDN
  3. After that the container starts
This seems like a weird bug.
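A sketch of the same workaround from the shell (the instance name comes from my zone ID, node3):
Bash:
# stop the stale per-zone dnsmasq instance (instance name = zone ID)
systemctl stop dnsmasq@node3
# re-apply the SDN configuration (same as Apply in the GUI)
pvesh set /cluster/sdn
pct start 102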
 
The pinging issue here at the end is definitely caused by the conflict with frr.conf. As for the original errors, I'm not sure; they didn't seem to come back
(I created a 10.0.86.0/24 subnet and that worked).
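Presumably the working subnet entry ended up looking something like this (same shape as the one above, just the new range; the exact DHCP range values are my guess):
Code:
subnet: node3-10.0.86.0-24
        vnet test
        dhcp-range start-address=10.0.86.2,end-address=10.0.86.20
        gateway 10.0.86.1
        snat 1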
 
I have a similar problem: the "failed to create listening socket for...", "Could not run after_configure for DHCP server"... dnsmasq report after applying the SDN config.

When you run SDN for the first time, the system reports that it requires the dnsmasq package, but it does not say that this is only needed for the automatic DHCP feature.
So you install dnsmasq, and then:
Bash:
systemctl disable --now dnsmasq
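Only the templated per-zone instances that SDN manages (e.g. dnsmasq@node3) should be running after that; a quick sanity check:
Bash:
# the default dnsmasq.service stays disabled; SDN drives templated dnsmasq@<zone> instances
systemctl status 'dnsmasq@*' --no-pager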

Ref: Setup Simple Zone With SNAT and DHCP
https://pve.proxmox.com/wiki/Setup_Simple_Zone_With_SNAT_and_DHCP