Hi,
I have a very strange network issue that I cannot solve, and I'm hoping somebody in this forum has an idea what the cause could be. I'm neither a Proxmox expert nor a networking expert, but I have some networking knowledge and Proxmox experience.
I have a shell script that pings a specified IP address and writes a log entry when the host stops responding and again when it comes back. I can run multiple instances of this script in the background to check multiple IP addresses. The script takes 2 parameters: the IP address and the ping wait time in seconds. I run it with "nohup ./watch_ip.sh IP_ADDRESS WAIT_TIME &". For example: "nohup ./watch_ip.sh 192.168.1.1 1 &". So far so good.
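To start one watcher per address without typing each command, a small loop can do it. This is only a sketch: the function name `start_watchers` and the `WATCH_SCRIPT` variable are made up for illustration, and it assumes `watch_ip.sh` sits in the current directory. Unlike the plain nohup call, it sends the watcher's own output to /dev/null so no nohup.out file is created.

```shell
#!/bin/bash
# Hypothetical helper: start one background watcher per IP address.
# WATCH_SCRIPT is an assumed variable; it defaults to ./watch_ip.sh.
WATCH_SCRIPT="${WATCH_SCRIPT:-./watch_ip.sh}"

start_watchers() {
    local wait_time="$1"
    shift
    local ip
    for ip in "$@"; do
        # nohup keeps the watcher running after the shell exits
        nohup "$WATCH_SCRIPT" "$ip" "$wait_time" > /dev/null 2>&1 &
        echo "started watcher for $ip (PID $!)"
    done
}

# Example: start_watchers 1 192.168.1.250 192.168.1.204 192.168.1.205
```

The printed PIDs make it easier to stop a single watcher later with kill.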
Now the strange issue: When I run this script on my Proxmox 8.2.4 host with specific IP addresses (192.168.1.250, 192.168.1.204, 192.168.1.205), the log file fills up with errors. When I run the same script with the same IP addresses from an LXC container on the same host, no errors are logged! What???
Furthermore, when I use other IP addresses on the PM host (e.g. 192.168.1.1), no error is logged. Strange, isn't it?
The problematic IP addresses belong to a Fritzbox 7490 with 2 IP cameras connected. But again, running the script from a container is no problem; the errors only appear when it runs on the PM host itself. Running the same script on another host on the same network is also OK.
Notes:
- I disabled PCIe power management with the kernel parameter "pcie_port_pm=off". This didn't solve the issue, but I left it active.
- I changed the PM host LAN cable, which didn't help.
- dmesg logs several errors that might be related, but I don't understand them.
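To double-check that the kernel parameter from the first note is really active on the running kernel, /proc/cmdline can be inspected. A minimal sketch; the function name `has_kernel_param` is hypothetical and only does a whole-word match on the command line string.

```shell
#!/bin/bash
# Hypothetical helper: succeed if a parameter appears as a whole word
# in a kernel command line string.
has_kernel_param() {
    local param="$1" cmdline="$2"
    case " $cmdline " in
        *" $param "*) return 0 ;;
        *) return 1 ;;
    esac
}

# On a Linux host, read the running kernel's command line:
if [ -r /proc/cmdline ]; then
    if has_kernel_param "pcie_port_pm=off" "$(cat /proc/cmdline)"; then
        echo "pcie_port_pm=off is active"
    else
        echo "pcie_port_pm=off is NOT active"
    fi
fi
```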
System info below and attached:
PM Host HW: Beelink Mini PC, 12th Gen Intel Alder Lake-N100 processor (up to 3.40GHz), EQ12 Office Mini Computer, 8GB DDR5 500GB SSD Mini Desktop PC, Dual 2.5G Ethernet/Dual HDMI/WiFi 6/WOL/Auto Power On
ethtool output:
root@pve:~# ethtool enp2s0
Settings for enp2s0:
        Supported ports: [ TP ]
        Supported link modes:   10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Full
                                2500baseT/Full
        Supported pause frame use: Symmetric
        Supports auto-negotiation: Yes
        Supported FEC modes: Not reported
        Advertised link modes:  10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Full
                                2500baseT/Full
        Advertised pause frame use: Symmetric
        Advertised auto-negotiation: Yes
        Advertised FEC modes: Not reported
        Speed: 1000Mb/s
        Duplex: Full
        Auto-negotiation: on
        Port: Twisted Pair
        PHYAD: 0
        Transceiver: internal
        MDI-X: off (auto)
        Supports Wake-on: pumbg
        Wake-on: g
        Current message level: 0x00000007 (7)
                               drv probe link
        Link detected: yes
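The output shows that the port supports 2500baseT/Full but the link is currently running at 1000Mb/s, full duplex. To pull out only the negotiated values, the ethtool output can be filtered with awk. This is just a sketch; the function name `link_summary` is hypothetical, and it relies only on the "Speed:", "Duplex:" and "Link detected:" lines shown above.

```shell
#!/bin/bash
# Hypothetical helper: reduce ethtool output (read from stdin) to one line
# with the negotiated speed, duplex mode and link state.
link_summary() {
    awk -F': *' '
        /^[[:space:]]*Speed:/         { speed = $2 }
        /^[[:space:]]*Duplex:/        { duplex = $2 }
        /^[[:space:]]*Link detected:/ { link = $2 }
        END { printf "speed=%s duplex=%s link=%s\n", speed, duplex, link }
    '
}

# Usage on the host: ethtool enp2s0 | link_summary
```

Running this while the pings fail would show whether the link state or speed changes at the same time.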
This is the script:
Bash:
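#!/bin/bash
# Check whether a host answers ping and log when it goes down or comes back.
# Usage: ./watch_ip.sh IP_ADDRESS WAIT_TIME

# CONFIGURATION
# IP of host
WATCH_IP="$1"
# Time to wait for a response, in seconds
WAIT_TIME="$2"
# Path to logfile
LOGFILE="/var/log/watchip-$WATCH_IP.log"
# Pause between pings, in seconds
PAUSE=1
# How many consecutive failed pings before "down" is logged
TESTS=1

# SCRIPT
# Both parameters are required
if [ -z "$WATCH_IP" ] || [ -z "$WAIT_TIME" ]; then
    echo "Usage: $0 IP_ADDRESS WAIT_TIME" >&2
    exit 1
fi

# Initialize
MISSED=0
touch "$LOGFILE"

while true; do
    if ! ping -c 1 -W "$WAIT_TIME" "$WATCH_IP" > /dev/null; then
        ((MISSED++))
    else
        # Host answered: log the recovery if it had been reported down
        if [ "$MISSED" -ge "$TESTS" ]; then
            echo "$(date) - $WATCH_IP is up again." >> "$LOGFILE"
        fi
        MISSED=0
    fi
    # Log "down" once, when the failure count first reaches the threshold
    if [ "$MISSED" -eq "$TESTS" ]; then
        echo "$(date) - $WATCH_IP is down." >> "$LOGFILE"
    fi
    sleep "$PAUSE"
done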
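To put a number on "the log file fills up", the "is down." entries written by the script can be counted per log file. A minimal sketch, assuming the log path and message format used in the script; the function name `count_down_events` is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: count "is down." entries in one watcher log.
count_down_events() {
    # grep -c prints 0 but exits non-zero when nothing matches,
    # so the exit status is ignored here.
    grep -c 'is down\.$' "$1" || true
}

# Print one line per log file written by the watcher script.
for log in /var/log/watchip-*.log; do
    [ -e "$log" ] || continue   # glob did not match any file
    echo "$log: $(count_down_events "$log") down events"
done
```

Comparing the counts from the PM host with those from the container over the same time span makes the difference easy to show.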
PM host dmesg output attached.