How can I wait for an online network on my vm?

Keyinator

Member
Jan 29, 2022
Hello everyone,

I have a service on my vm that should only start once ens18 (the connection to ethernet) is online and working.

I've tried two solutions:
Code:
[Unit]
Description=Fivem Server
Requires=systemd-networkd-wait-online
After=systemd-networkd-wait-online

[Service]
Type=simple
ExecStartPre=/usr/lib/systemd/systemd-networkd-wait-online -i ens18 -o routable
ExecStart=//START SOMETHING
User=fivem

[Install]
WantedBy=multi-user.target

Code:
[Unit]
Description=Fivem Server
BindsTo=sys-devices-pci0000:00-0000:00:12.0-virtio2-net-ens18.device
After=sys-devices-pci0000:00-0000:00:12.0-virtio2-net-ens18.device

[Service]
Type=simple
ExecStart=//START SOMETHING
User=fivem

[Install]
WantedBy=multi-user.target

However, after a reboot the started service always reports CURL error code 6 (Couldn't resolve host name).
Is there anything I could do?
 
Almost anything inside a VM is out of scope for a "PVE installation and configuration" forum. A generic Linux OS site like stackexchange will have covered this topic a couple of dozen times.

Since you haven't specified what flavor of Linux you are running, here is a Debian search string that might help:
"debian start service after network"

https://www.freedesktop.org/wiki/Software/systemd/NetworkTarget/
https://unix.stackexchange.com/ques...cript-to-execute-after-networking-has-started


Ultra low latency all-NVME shared storage for Proxmox - https://www.blockbridge.com/proxmox
 
Thanks for the response,

"Since you haven't specified what flavor of Linux you are running"
I am using Ubuntu as the OS on my VM.

"A generic Linux OS site like stackexchange would have this topic covered couple of dozen times."
If you take a look at the code I provided, you will see that I already tried a stricter variation of the solutions you linked.

I posted the issue here because the VM itself does not seem to be the problem.

My network config may be helpful so I'll also add it:
Code:
auto enp9s0
iface enp9s0 inet static
  address  <MAIN IP>/26
  gateway  <GW>

iface enp9s0 inet6 static
  address  <MAIN IPv6>
  netmask  128
  gateway  <GWv6>

auto vmbr0
iface vmbr0 inet static
  address  10.0.0.1/24
  bridge-ports none
  bridge-stp off
  bridge-fd 0
  post-up   echo 1 > /proc/sys/net/ipv4/ip_forward
  post-up   iptables -t nat -A POSTROUTING -s '10.0.0.0/24' -o enp9s0 -j MASQUERADE
  post-down iptables -t nat -D POSTROUTING -s '10.0.0.0/24' -o enp9s0 -j MASQUERADE
  up ip route add <SECOND IP>/32 dev vmbr0

  post-up iptables -t nat -A PREROUTING  -p tcp --dport 30122 -d <SECOND IP> -j DNAT --to-destination 10.0.0.42:30120
  post-up iptables -t nat -A POSTROUTING -p tcp --sport 30120 -s 10.0.0.42   -j SNAT --to-source <SECOND IP>:30122
  post-up iptables -t nat -A PREROUTING  -p udp --dport 30122 -d <SECOND IP> -j DNAT --to-destination 10.0.0.42:30120
  post-up iptables -t nat -A POSTROUTING -p udp --sport 30120 -s 10.0.0.42   -j SNAT --to-source <SECOND IP>:30122

  #OUTBOUND CONNECTIONS
  post-up iptables -t nat -A POSTROUTING -p tcp -s 10.0.0.42 -j SNAT --to-source 162.55.98.16
Also, I forgot to clarify what I mean by "restart". The Proxmox node is up and running and has a connection. When I restart the VM using sudo reboot and the service is started, the state of the network (ens18) is already routable and configured. However, it apparently takes a couple more seconds before the network is actually usable.

Could this have to do with the node?
 
The new data you provided shows that you have more than the basic configuration that most people are dealing with.

You have not mentioned it, but I suspect the service can be started manually some time after boot? If so, that indicates there is a race condition between when the VM thinks things are online and when they are truly online.

The range of causes could be quite large, e.g. your hypervisor is stressed and things are not programmed as fast as you would normally expect. Or, based on the actual error, access to DNS is not immediately available.

You have two main options:
- Try to troubleshoot what causes the race and the time window in which it happens. You may find out it's completely external and out of your control from inside the VM. A simple way to reduce the number of variables is to create a basic service that pings by IP and by name instead of running the full-blown "fivem".
- A faster way to a resolution would be to simply add a restart setting to the service and let it work through whatever delay exists.
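
For the first option, a small logging script might be enough to show whether IP reachability or name resolution is the part that lags after boot. This is only a sketch: the function names probe and log_network_state are made up for illustration, and the IP and host name in the example call are placeholders you would swap for your own gateway and the host your service contacts.

```shell
#!/bin/sh
# Hypothetical diagnostic: sample once per second whether an IP answers
# and whether a name resolves, to see which one becomes available last.

# probe LABEL COMMAND...: print "LABEL=ok" if COMMAND succeeds, else "LABEL=fail".
probe() {
    label="$1"
    shift
    if "$@" >/dev/null 2>&1; then
        echo "${label}=ok"
    else
        echo "${label}=fail"
    fi
}

# log_network_state IP NAME COUNT: print COUNT timestamped samples.
log_network_state() {
    ip="$1"
    name="$2"
    count="$3"
    i=0
    while [ "$i" -lt "$count" ]; do
        # ping tests plain IP reachability; getent goes through the system resolver.
        echo "$(date +%T) $(probe ip ping -c 1 -W 1 "$ip") $(probe dns getent hosts "$name")"
        i=$((i + 1))
        sleep 1
    done
}

# Example call (placeholder values, adjust to your setup):
# log_network_state 10.0.0.1 example.com 30
```

If the ip column turns to ok several seconds before the dns column does, the delay is in name resolution rather than in the link itself.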



 
"You have not mentioned it, but I suspect the service can be started manually some time after boot? If so, that indicates there is a race condition between when the VM thinks things are online and when they are truly online."
Yes. I think that is the cause.

"The range of causes could be quite large, e.g. your hypervisor is stressed and things are not programmed as fast as you would normally expect. Or, based on the actual error, access to DNS is not immediately available."
I don't think it is stress as the server runs relatively chill (15% cpu, 0.4% io delay, nvme ssd)
However, I do suspect that the NIC's driver might be causing this issue. Running lspci gives me this result:
Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 15)

Either way, since the issue is really difficult to debug and I don't want to risk messing things up further, I have added ExecStartPre=/bin/sleep 10 to the service.
Just making it restart on failure does not work, since the service does not exit with an error; it only shows the error in its web console :/
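
As a possible alternative to a fixed sleep, the pre-start step could poll until name resolution actually works and give up after a timeout. The following is a minimal sketch, not a tested fix for this setup: the function name wait_until is made up, and example.com stands in for whatever host the service needs to reach.

```shell
#!/bin/sh
# wait_until TIMEOUT COMMAND...: rerun COMMAND once per second until it
# succeeds (returns 0) or roughly TIMEOUT seconds have passed (returns 1).
wait_until() {
    timeout="$1"
    shift
    elapsed=0
    until "$@" >/dev/null 2>&1; do
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}

# Example: wait up to 30 seconds for a name to resolve (placeholder host).
# wait_until 30 getent hosts example.com
```

Saved as a script (say /usr/local/bin/wait-for-dns.sh, a hypothetical path) with the example line uncommented, it could replace the sleep in ExecStartPre. The service then starts as soon as DNS answers instead of always waiting 10 seconds, and the start fails visibly if DNS never comes up. The timeout should stay below the unit's start timeout, since ExecStartPre commands count toward it.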
 
I agree that load is an unlikely cause, just an extreme example of possibilities.

We used to deal with some Mellanox cards where driver/firmware initialization was delayed, and we solved it by using the LINKDELAY=xx option in the ifcfg file, so similar to your solution. However, I am not sure this option is still available in semi-modern distros.


 
