LXC networking & problem with systemd-networkd-wait-online.service

Dunuin

Hi,

I've got a privileged Debian 11 LXC with PBS2 that I upgraded to Debian 12 + PBS3 following the PBS2-to-3 documentation.
Looks like it is basically working but some things confuse me:

1.) There is no sshd.service, but I can still ssh into that LXC. Before the upgrade there was an sshd.service, but it was always in the "dead" state. Does an LXC not run its own SSH server and use the host's instead, or how else am I able to ssh into the LXC? If there is no sshd.service running, how should I harden it? I guess my LXC's /etc/ssh/sshd_config then also gets ignored?

2.) Where does PVE store the network config I set up in the PVE webUI for that LXC? I thought PVE would write it to /etc/network/interfaces, but that doesn't seem to be the case, as the config in /etc/network/interfaces doesn't match what's set up in the webUI. In the webUI my NICs are set to MTU 9000, but there is no "mtu 9000" line in /etc/network/interfaces, and ip addr was also showing an MTU of 1500. Only after I edited /etc/network/interfaces and manually added the "mtu 9000" lines to the interfaces did ip addr show an MTU of 9000.

3.) The LXC's console in the PVE webUI no longer works after the upgrade to Debian 12. The screen is just black. So I searched the logs to see whether the network was still waiting and hadn't timed out, and indeed, when starting the LXC, it is "systemd-networkd-wait-online.service" that is blocking:
Code:
Jul 22 21:42:00 PBS proxmox-backup-proxy[163]: lookup_datastore failed - unable to open chunk store 'PBS_DS1' at "/mnt/pbs/.chunks" - No such file or directory (os error 2)
Jul 22 21:42:00 PBS proxmox-backup-[163]: PBS proxmox-backup-proxy[163]: GET /api2/json/admin/datastore/PBS_DS1/status: 400 Bad Request: [client [::ffff:192.168.49.42]:55082] unable to open chunk store 'PBS_DS1' at "/mnt/pbs/.chunks" - >
Jul 22 21:42:01 PBS proxmox-backup-[163]: PBS proxmox-backup-proxy[163]: GET /api2/json/admin/datastore/PBS_DS1/status: 400 Bad Request: [client [::ffff:192.168.49.42]:55094] unable to open chunk store 'PBS_DS1' at "/mnt/pbs/.chunks" - >
Jul 22 21:42:02 PBS proxmox-backup-[163]: PBS proxmox-backup-proxy[163]: GET /api2/json/admin/datastore/PBS_DS1/status: 400 Bad Request: [client [::ffff:192.168.49.42]:55122] unable to open chunk store 'PBS_DS1' at "/mnt/pbs/.chunks" - >
Jul 22 21:42:03 PBS proxmox-backup-[163]: PBS proxmox-backup-proxy[163]: GET /api2/json/admin/datastore/PBS_DS1/status: 400 Bad Request: [client [::ffff:192.168.49.42]:55150] unable to open chunk store 'PBS_DS1' at "/mnt/pbs/.chunks" - >
Jul 22 21:42:03 PBS proxmox-backup-[163]: PBS proxmox-backup-proxy[163]: GET /api2/json/admin/datastore/PBS_DS1/status: 400 Bad Request: [client [::ffff:192.168.49.22]:44348] unable to open chunk store 'PBS_DS1' at "/mnt/pbs/.chunks" - >
Jul 22 21:42:07 PBS proxmox-backup-[163]: PBS proxmox-backup-proxy[163]: GET /api2/json/admin/datastore/PBS_DS1/status: 400 Bad Request: [client [::ffff:192.168.49.5]:44580] unable to open chunk store 'PBS_DS1' at "/mnt/pbs/.chunks" - N>
Jul 22 21:42:08 PBS proxmox-backup-[163]: PBS proxmox-backup-proxy[163]: GET /api2/json/admin/datastore/PBS_DS1/status: 400 Bad Request: [client [::ffff:192.168.49.42]:33320] unable to open chunk store 'PBS_DS1' at "/mnt/pbs/.chunks" - >
Jul 22 21:42:10 PBS proxmox-backup-[163]: PBS proxmox-backup-proxy[163]: GET /api2/json/admin/datastore/PBS_DS1/status: 400 Bad Request: [client [::ffff:192.168.49.42]:33348] unable to open chunk store 'PBS_DS1' at "/mnt/pbs/.chunks" - >
Jul 22 21:42:11 PBS proxmox-backup-[163]: PBS proxmox-backup-proxy[163]: GET /api2/json/admin/datastore/PBS_DS1/status: 400 Bad Request: [client [::ffff:192.168.49.22]:33132] unable to open chunk store 'PBS_DS1' at "/mnt/pbs/.chunks" - >
Jul 22 21:42:13 PBS proxmox-backup-[163]: PBS proxmox-backup-proxy[163]: GET /api2/json/admin/datastore/PBS_DS1/status: 400 Bad Request: [client [::ffff:192.168.49.22]:33148] unable to open chunk store 'PBS_DS1' at "/mnt/pbs/.chunks" - >
Jul 22 21:42:14 PBS proxmox-backup-[163]: PBS proxmox-backup-proxy[163]: GET /api2/json/admin/datastore/PBS_DS1/status: 400 Bad Request: [client [::ffff:192.168.49.42]:33370] unable to open chunk store 'PBS_DS1' at "/mnt/pbs/.chunks" - >
Jul 22 21:42:17 PBS proxmox-backup-[163]: PBS proxmox-backup-proxy[163]: GET /api2/json/admin/datastore/PBS_DS1/status: 400 Bad Request: [client [::ffff:192.168.49.22]:33156] unable to open chunk store 'PBS_DS1' at "/mnt/pbs/.chunks" - >
Jul 22 21:42:17 PBS proxmox-backup-[163]: PBS proxmox-backup-proxy[163]: GET /api2/json/admin/datastore/PBS_DS1/status: 400 Bad Request: [client [::ffff:192.168.49.5]:45726] unable to open chunk store 'PBS_DS1' at "/mnt/pbs/.chunks" - N>
Jul 22 21:42:20 PBS proxmox-backup-[163]: PBS proxmox-backup-proxy[163]: GET /api2/json/admin/datastore/PBS_DS1/status: 400 Bad Request: [client [::ffff:192.168.49.42]:42652] unable to open chunk store 'PBS_DS1' at "/mnt/pbs/.chunks" - >
Jul 22 21:42:20 PBS proxmox-backup-[163]: PBS proxmox-backup-proxy[163]: GET /api2/json/admin/datastore/PBS_DS1/status: 400 Bad Request: [client [::ffff:192.168.49.42]:42668] unable to open chunk store 'PBS_DS1' at "/mnt/pbs/.chunks" - >
Jul 22 21:42:22 PBS systemd-networkd-wait-online[67]: Timeout occurred while waiting for network connectivity.
Jul 22 21:42:22 PBS systemd[1]: systemd-networkd-wait-online.service: Main process exited, code=exited, status=1/FAILURE
Jul 22 21:42:22 PBS systemd[1]: systemd-networkd-wait-online.service: Failed with result 'exit-code'.
Jul 22 21:42:22 PBS systemd[1]: Failed to start systemd-networkd-wait-online.service - Wait for Network to be Configured.
Jul 22 21:42:22 PBS systemd[1]: Reached target network-online.target - Network is Online.
Jul 22 21:42:22 PBS systemd[1]: Mounting mnt-pbs.mount - /mnt/pbs...
Jul 22 21:42:22 PBS systemd[1]: Started filebeat.service - Filebeat sends log files to Logstash or directly to Elasticsearch..
Jul 22 21:42:22 PBS systemd[1]: Starting postfix@-.service - Postfix Mail Transport Agent (instance -)...
Jul 22 21:42:22 PBS systemd[1]: Starting rpc-statd-notify.service - Notify NFS peers of a restart...
Jul 22 21:42:22 PBS sm-notify[234]: Version 2.6.2 starting
Jul 22 21:42:22 PBS systemd[1]: Started rpc-statd-notify.service - Notify NFS peers of a restart.
Jul 22 21:42:22 PBS filebeat[231]: {"log.level":"warn","@timestamp":"2023-07-22T21:42:22.462+0200","log.origin":{"file.name":"beater/filebeat.go","file.line":175},"message":"Filebeat is unable to load the ingest pipelines for the config>
Jul 22 21:42:22 PBS postfix[292]: Postfix is using backwards-compatible default settings
Jul 22 21:42:22 PBS postfix[292]: See http://www.postfix.org/COMPATIBILITY_README.html for details
Jul 22 21:42:22 PBS postfix[292]: To disable backwards compatibility use "postconf compatibility_level=3.6" and "postfix reload"
Jul 22 21:42:22 PBS systemd[1]: Mounted mnt-pbs.mount - /mnt/pbs.
Jul 22 21:42:22 PBS systemd[1]: Reached target remote-fs.target - Remote File Systems.
As far as I understand, LXCs aren't supposed to use networkd, since I see these lines in /etc/systemd/system-preset/00-pve.preset, created by PVE:
Code:
# Added by PVE at create-time for first-boot configuration.
disable systemd-networkd.service
But I'm not sure how to fix this hanging systemd-networkd-wait-online.service. My network seems to be working fine, but systemd-networkd-wait-online.service is preventing other services from starting, such as the mount of my NFS share, so PBS can't access my datastore for some minutes after starting the LXC. I guess the black console is caused by this too, like what you see when setting up IPv6 via DHCP without providing an IPv6 DHCP server.

4.) Is there a way to limit PBS logging to something like "warning" or higher? I couldn't find such an option in either the PBS webUI or the documentation. PBS is really spamming the logs with hundreds of thousands of lines like these when doing a backup or a sync:
Code:
Jul 20 15:22:59 PBS proxmox-backup-proxy[188]: GET /chunk
Jul 20 15:22:59 PBS proxmox-backup-proxy[188]: download chunk "/mnt/pbs/.chunks/7e5c/7e5cffd860dcb613c3d6785b9b0c30f229c88ddc9b9804e9d3690131ac7cd1d9
My Graylog really has problems processing all those logs and writing them to the DB. I could create some exclude rules for filebeat so they won't be sent to my log server, but I would really prefer not to have those logs at all, so journald isn't writing to the SSDs all the time...
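
To get a feel for how much of the journal this per-chunk noise makes up before writing any exclude rules, the two message types quoted above can be counted. This is only a rough sketch: count_chunk_noise is a made-up helper name, and the pattern is assumed from the two sample lines, so other PBS versions may word these messages differently.

```shell
# Count journal lines on stdin that look like the per-chunk messages
# quoted above ("GET /chunk" and "download chunk ...").
# Assumption: lines use the default short journal format shown in this thread.
count_chunk_noise() {
    # grep -c exits non-zero when nothing matches; it still prints 0.
    grep -c -E 'proxmox-backup-proxy\[[0-9]+\]: (GET /chunk|download chunk )' || true
}
```

Something like `journalctl -u proxmox-backup-proxy --since today | count_chunk_noise` would then give the count for the current day, assuming the unit is named proxmox-backup-proxy.service.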


Edit: Looks like my LXC is still using systemd-networkd:
Code:
systemctl status systemd-networkd.service
● systemd-networkd.service - Network Configuration
     Loaded: loaded (/lib/systemd/system/systemd-networkd.service; enabled; preset: disabled)
     Active: active (running) since Sat 2023-07-22 21:40:22 CEST; 57min ago
TriggeredBy: ● systemd-networkd.socket
       Docs: man:systemd-networkd.service(8)
             man:org.freedesktop.network1(5)
   Main PID: 66 (systemd-network)
     Status: "Processing requests..."
      Tasks: 1 (limit: 115638)
     Memory: 1.5M
        CPU: 51ms
     CGroup: /system.slice/systemd-networkd.service
             └─66 /lib/systemd/systemd-networkd

Jul 22 21:40:22 PBS systemd-networkd[66]: lo: Link UP
Jul 22 21:40:22 PBS systemd-networkd[66]: lo: Gained carrier
Jul 22 21:40:22 PBS systemd-networkd[66]: Enumeration completed
Jul 22 21:40:22 PBS systemd[1]: Started systemd-networkd.service - Network Configuration.
Jul 22 21:40:32 PBS systemd-networkd[66]: eth0: Link UP
Jul 22 21:40:32 PBS systemd-networkd[66]: eth0: Gained carrier
Jul 22 21:40:33 PBS systemd-networkd[66]: eth1: Link UP
Jul 22 21:40:33 PBS systemd-networkd[66]: eth1: Gained carrier
Jul 22 21:40:33 PBS systemd-networkd[66]: eth2: Link UP
Jul 22 21:40:33 PBS systemd-networkd[66]: eth2: Gained carrier

And after running systemctl disable systemd-networkd.service and rebooting the LXC, the console is working again and the NFS share too. So I guess the upgrade from Debian 11 to 12 somehow enabled systemd-networkd again, and it shouldn't be enabled?
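
The status output above reports "enabled; preset: disabled", so the enablement state differs from the preset policy. A minimal sketch for inspecting that and resetting units to their presets, assuming the preset files inside the container (such as the 00-pve.preset quoted earlier) describe the intended state. The function names and the UNITS list are purely illustrative, and the commands need root inside the container.

```shell
# Units discussed in this thread; adjust as needed (illustrative list).
UNITS="systemd-networkd.service systemd-networkd-wait-online.service"

# Print "<unit> <state>" for each unit, as reported by `systemctl is-enabled`.
show_enablement() {
    for unit in $UNITS; do
        # is-enabled exits non-zero for "disabled" but still prints the state.
        state=$(systemctl is-enabled "$unit" 2>/dev/null) || true
        printf '%s %s\n' "$unit" "${state:-unknown}"
    done
}

# Reset each unit's enable/disable state to what the preset files say.
apply_presets() {
    for unit in $UNITS; do
        systemctl preset "$unit"
    done
}
```

Running show_enablement before and after apply_presets shows whether anything changed; a reboot of the LXC would still be needed to see the effect on boot time.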

Edit:
And looks like the ssh server is running as ssh.service instead of sshd.service like it is the case when running a Debian 12 VM?:
Code:
root@PBS:~# systemctl status sshd
Unit sshd.service could not be found.
root@PBS:~# systemctl status ssh
● ssh.service - OpenBSD Secure Shell server
     Loaded: loaded (/lib/systemd/system/ssh.service; disabled; preset: enabled)
     Active: active (running) since Sat 2023-07-22 22:45:53 CEST; 2min 27s ago
TriggeredBy: ● ssh.socket
       Docs: man:sshd(8)
             man:sshd_config(5)
    Process: 388 ExecStartPre=/usr/sbin/sshd -t (code=exited, status=0/SUCCESS)
   Main PID: 389 (sshd)
      Tasks: 1 (limit: 115638)
     Memory: 3.5M
        CPU: 69ms
     CGroup: /system.slice/ssh.service
             └─389 "sshd: /usr/sbin/sshd -D [listener] 0 of 10-100 startups"

Jul 22 22:45:53 PBS sshd[389]: Server listening on :: port 22.
Jul 22 22:45:53 PBS systemd[1]: Started ssh.service - OpenBSD Secure Shell server.
Jul 22 22:45:53 PBS sshd[390]: Postponed publickey for root from 192.168.43.30 port 59064 ssh2 [preauth]
Jul 22 22:45:53 PBS sshd[390]: Accepted publickey for root from 192.168.43.30 port 59064 ssh2: ED25519 SHA256:<redacted>
Jul 22 22:45:53 PBS sshd[390]: pam_unix(sshd:session): session opened for user root(uid=0) by (uid=0)
Jul 22 22:45:53 PBS sshd[390]: pam_env(sshd:session): deprecated reading of user environment enabled

Edit:
My other PBS LXC, which is still on Debian 11 + PBS2, also has systemd-networkd.service running, but there systemd-networkd-wait-online.service doesn't time out.

Would be nice to know how this is supposed to be.

A.) should systemd-networkd.service be disabled?
B.) should systemd-networkd.service be enabled but systemd-networkd-wait-online.service disabled?
C.) should both be enabled but there might be some fix so systemd-networkd-wait-online.service won't get stuck?
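
For option C, one stopgap that should at least shorten the stall is capping the wait with a drop-in. In the log above the timeout hits 120 seconds after systemd-networkd started, which matches the default timeout of systemd-networkd-wait-online. A sketch, assuming the binary lives at /lib/systemd/systemd-networkd-wait-online (as the networkd path in the status output suggests); write_wait_online_override and the 10-second default are made up for illustration, and this does not address why no link is reported as configured.

```shell
# Write a drop-in that replaces ExecStart with a shorter --timeout.
# $1: drop-in directory (defaults to the usual override location)
# $2: timeout in seconds (default 10, an arbitrary illustrative value)
write_wait_online_override() {
    dir="${1:-/etc/systemd/system/systemd-networkd-wait-online.service.d}"
    secs="${2:-10}"
    mkdir -p "$dir"
    # The empty ExecStart= clears the packaged command before setting a new one.
    cat > "$dir/override.conf" <<EOF
[Service]
ExecStart=
ExecStart=/lib/systemd/systemd-networkd-wait-online --timeout=$secs
EOF
}
```

After calling write_wait_online_override as root, a `systemctl daemon-reload` is needed for the drop-in to be picked up.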
 
@tteckster:
Looks like you also had a problem with systemd-networkd-wait-online.service, as you recommend disabling it when upgrading a Debian 11 LXC to Debian 12?:
https://github.com/tteck/Proxmox/discussions/1498 said:
Updating the LXC to Debian 12 manually could potentially cause a loss of functionality.
Here's the approach I took to upgrade mine, but remember to have backups as a precautionary measure.

Copy & Paste in the LXC console

cat <<EOF >/etc/apt/sources.list
deb http://ftp.debian.org/debian bookworm main contrib
deb http://ftp.debian.org/debian bookworm-updates main contrib
deb http://security.debian.org/debian-security bookworm-security main contrib
EOF
apt-get update
DEBIAN_FRONTEND=noninteractive apt-get -o Dpkg::Options::="--force-confold" dist-upgrade -y
systemctl disable -q --now systemd-networkd-wait-online.service
rm -rf /usr/lib/python3.*/EXTERNALLY-MANAGED
apt-get autoremove --purge -y
 
The bridge interface seems to be causing a failure in `systemd-networkd-wait-online`, resulting in a two-minute boot pause.
 
So I created a new Debian 12 LXC:

Code:
root@testlxc:~# systemctl status systemd-networkd.service
* systemd-networkd.service - Network Configuration
     Loaded: loaded (/lib/systemd/system/systemd-networkd.service; disabled; preset: disabled)
     Active: active (running) since Tue 2023-07-25 20:09:28 UTC; 32s ago
TriggeredBy: * systemd-networkd.socket
       Docs: man:systemd-networkd.service(8)
   Main PID: 105 (systemd-network)
     Status: "Processing requests..."
      Tasks: 1 (limit: 38373)
     Memory: 2.1M
        CPU: 40ms
     CGroup: /system.slice/systemd-networkd.service
             `-105 /lib/systemd/systemd-networkd

Jul 25 20:09:28 testlxc systemd-networkd[105]: Failed to increase receive buffer size for general netlink socket, ignoring: Operation not permitted
Jul 25 20:09:28 testlxc systemd-networkd[105]: eth0: Link UP
Jul 25 20:09:28 testlxc systemd-networkd[105]: eth0: Gained carrier
Jul 25 20:09:28 testlxc systemd-networkd[105]: lo: Link UP
Jul 25 20:09:28 testlxc systemd-networkd[105]: lo: Gained carrier
Jul 25 20:09:28 testlxc systemd-networkd[105]: Enumeration completed
Jul 25 20:09:28 testlxc systemd[1]: Started systemd-networkd.service - Network Configuration.
Jul 25 20:09:28 testlxc systemd-networkd[105]: eth0: Lost carrier
Jul 25 20:09:28 testlxc systemd-networkd[105]: eth0: Gained carrier
Jul 25 20:09:30 testlxc systemd-networkd[105]: eth0: Gained IPv6LL

Code:
root@testlxc:~# systemctl status systemd-networkd-wait-online.service
* systemd-networkd-wait-online.service - Wait for Network to be Configured
     Loaded: loaded (/lib/systemd/system/systemd-networkd-wait-online.service; disabled; preset: disabled)
     Active: inactive (dead)
       Docs: man:systemd-networkd-wait-online.service(8)
Code:
root@testlxc:~# systemctl list-unit-files | grep network
UNIT FILE                              STATE           PRESET
networking.service                     enabled         enabled
systemd-network-generator.service      enabled         enabled
systemd-networkd-wait-online.service   disabled        disabled
systemd-networkd-wait-online@.service  disabled        enabled
systemd-networkd.service               disabled        disabled
systemd-networkd.socket                enabled         enabled
network-online.target                  static          -
network-pre.target                     static          -
network.target                         static          -

So it looks like a Debian 12 LXC has both systemd-networkd.service and systemd-networkd-wait-online.service disabled by default, but systemd-networkd.service is still running... why?
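
One hint is in the status output itself: the service is listed as TriggeredBy systemd-networkd.socket, and the list above shows that socket as enabled, so socket activation may be what starts the service even though the service unit is disabled. That is a reading of the output, not something confirmed in this thread. A small sketch to list which unit files are actually enabled (enabled_units is a hypothetical helper name):

```shell
# Print the names of unit files whose STATE column is exactly "enabled".
# Expects `systemctl list-unit-files` output on stdin; the header and
# any "static"/"disabled" rows are skipped because column 2 differs.
enabled_units() {
    awk '$2 == "enabled" { print $1 }'
}
```

Used as `systemctl list-unit-files | enabled_units | grep network`, this would list systemd-networkd.socket for the container above. If the socket really is the trigger, `systemctl disable --now systemd-networkd.socket` would be the thing to try, ideally on a test container first.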