Error: Message too long in External Metric

Linyu

Hi Forum,
Using Proxmox with InfluxDB to upload metric data has helped me a lot, and I love it. But I've run into errors while setting up one of the servers, which has a lot of VMs and LXC containers running on that node.
(screenshot of the error attached)

The pvestatd log shows:
failed to send metrics: Message too long
What can I do to solve that problem?
Thanks!

I have googled it and found that UDP can only send 64 KB of data at a time. Maybe the metrics data that needs to be sent is larger than this limit; can that be solved?
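
For what it's worth, that 64 KB limit is easy to see outside of PVE; something like the following (host and port are just placeholders, here loopback, since only the payload size matters) should fail with the same error:

Code:
root@pve:~# perl -MIO::Socket::IP -e '
  my $s = IO::Socket::IP->new(PeerHost => "127.0.0.1", PeerPort => 8089, Proto => "udp") or die "socket: $!";
  defined $s->send("x" x 70_000) or die "send failed: $!\n";  # UDP payload > 65507 bytes -> EMSGSIZE ("Message too long")
'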
 
mhmm... yes this seems like a bug
please open one here: https://bugzilla.proxmox.com/

and include the relevant part of the syslog, and maybe how many VMs/CTs/nodes/storages you have
 
I'm also seeing this in my syslog every 10 minutes, as follows...

Jun 17 09:15:19 pve-host1 pvestatd[2377]: qemu status update error: metrics send error 'influxdb': failed to send metrics: Message too long

I couldn't find any bug raised about this as suggested above, so I have raised the following bug:

https://bugzilla.proxmox.com/show_bug.cgi?id=2802
 
Hi @t.lamprecht

I'm on the latest version (see below), so I assume there's maybe another issue here that wasn't addressed by the previous fix?

Code:
root@pve-host1:~# pveversion -v
proxmox-ve: 6.2-1 (running kernel: 5.4.41-1-pve)
pve-manager: 6.2-6 (running version: 6.2-6/ee1d7754)
pve-kernel-5.4: 6.2-2
pve-kernel-helper: 6.2-2
pve-kernel-5.3: 6.1-6
pve-kernel-5.0: 6.0-11
pve-kernel-5.4.41-1-pve: 5.4.41-1
pve-kernel-5.4.34-1-pve: 5.4.34-2
pve-kernel-5.3.18-3-pve: 5.3.18-3
pve-kernel-5.0.21-5-pve: 5.0.21-10
pve-kernel-5.0.15-1-pve: 5.0.15-1
ceph-fuse: 12.2.11+dfsg1-2.1+b1
corosync: 3.0.3-pve1
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown: 0.8.35+pve1
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.15-pve1
libproxmox-acme-perl: 1.0.4
libpve-access-control: 6.1-1
libpve-apiclient-perl: 3.0-3
libpve-common-perl: 6.1-3
libpve-guest-common-perl: 3.0-10
libpve-http-server-perl: 3.0-5
libpve-storage-perl: 6.1-8
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve4
lxc-pve: 4.0.2-1
lxcfs: 4.0.3-pve2
novnc-pve: 1.1.0-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.2-7
pve-cluster: 6.1-8
pve-container: 3.1-8
pve-docs: 6.2-4
pve-edk2-firmware: 2.20200229-1
pve-firewall: 4.1-2
pve-firmware: 3.1-1
pve-ha-manager: 3.0-9
pve-i18n: 2.1-3
pve-qemu-kvm: 5.0.0-4
pve-xtermjs: 4.3.0-1
qemu-server: 6.2-3
smartmontools: 7.1-pve2
spiceterm: 3.1-1
vncterm: 1.6-1
zfsutils-linux: 0.8.4-pve1

Thanks!
 
I'm seeing the same errors today, all these months later, also on the latest version -

Code:
root@NUC10i3FNH-2:~# pveversion -v
proxmox-ve: 6.2-2 (running kernel: 5.4.65-1-pve)
pve-manager: 6.2-12 (running version: 6.2-12/b287dd27)
pve-kernel-5.4: 6.2-7
pve-kernel-helper: 6.2-7
pve-kernel-5.4.65-1-pve: 5.4.65-1
pve-kernel-5.4.60-1-pve: 5.4.60-2
pve-kernel-5.4.34-1-pve: 5.4.34-2
<snip>

Code:
root@NUC10i3FNH-2:~# cat /etc/pve/status.cfg
influxdb:
    server 192.168.0.51
    port 8089
root@NUC10i3FNH-2:~#

Code:
    "syslog_pri": "27",
    "syslog_facility": "daemon",
    "syslog_severity_code": 3,
    "message": "<27>Oct 27 22:49:08 NUC10i3FNH-2 pvestatd[1302]: qemu status update error: metrics send error 'influxdb': failed to send metrics: Message too long",
    "received_from": "192.168.0.11",
    "syslog_timestamp": "Oct 27 22:49:08",
 
could you post the output of ip l (feel free to censor the MACs) and ip r get 192.168.0.51?
 
Hi @fabian ,

I originally raised the bug for this over at Bugzilla, so I thought I'd add some more observations here.

I have 3 PVE nodes (pve-host1, pve-host2 and pve-host3, all on the latest version). The issue now only occurs on pve-host2 and pve-host3, because I made the change you suggested to the file /usr/share/perl5/PVE/Status/Plugin.pm on pve-host1 as follows:

Code:
#return 1450; # assume 1500 MTU, empty IPv6 UDP packet needs 48 bytes overhead
return 1400; # assume 1450 MTU, empty IPv6 UDP packet needs 48 bytes overhead
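
For what it's worth, that constant looks like the usable payload per UDP datagram; assuming the 48 bytes of overhead mentioned in the comment are the 40-byte IPv6 header plus the 8-byte UDP header, the numbers work out roughly as:

Code:
# assumed link MTU - 48 bytes overhead = usable payload per datagram
# 1500 - 48 = 1452  -> rounded down to 1450 in the stock code
# 1450 - 48 = 1402  -> rounded down to 1400 in the change above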

The above indicates that PVE expects the MTU to be 1500, and it is indeed set to 1500 on all my nodes, as the following output from pve-host2 shows:

Code:
root@pve-host2:~# ip l
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eno1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master vmbr0 state UP mode DEFAULT group default qlen 1000
link/ether XX:XX:XX:XX:XX:XX brd ff:ff:ff:ff:ff:ff
3: wlp0s20f3: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether XX:XX:XX:XX:XX:XX brd ff:ff:ff:ff:ff:ff
4: enx000ec6b7a093: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000
link/ether XX:XX:XX:XX:XX:XX brd ff:ff:ff:ff:ff:ff
5: vmbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
link/ether XX:XX:XX:XX:XX:XX brd ff:ff:ff:ff:ff:ff
6: tap122i0: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master fwbr122i0 state UNKNOWN mode DEFAULT group default qlen 1000
link/ether XX:XX:XX:XX:XX:XX brd ff:ff:ff:ff:ff:ff
7: fwbr122i0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
link/ether XX:XX:XX:XX:XX:XX brd ff:ff:ff:ff:ff:ff
8: fwpr122p0@fwln122i0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master vmbr0 state UP mode DEFAULT group default qlen 1000
link/ether XX:XX:XX:XX:XX:XX brd ff:ff:ff:ff:ff:ff
9: fwln122i0@fwpr122p0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master fwbr122i0 state UP mode DEFAULT group default qlen 1000
link/ether XX:XX:XX:XX:XX:XX brd ff:ff:ff:ff:ff:ff
10: tap112i0: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master fwbr112i0 state UNKNOWN mode DEFAULT group default qlen 1000
link/ether XX:XX:XX:XX:XX:XX brd ff:ff:ff:ff:ff:ff
11: fwbr112i0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
link/ether XX:XX:XX:XX:XX:XX brd ff:ff:ff:ff:ff:ff
12: fwpr112p0@fwln112i0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master vmbr0 state UP mode DEFAULT group default qlen 1000
link/ether XX:XX:XX:XX:XX:XX brd ff:ff:ff:ff:ff:ff
13: fwln112i0@fwpr112p0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master fwbr112i0 state UP mode DEFAULT group default qlen 1000
link/ether XX:XX:XX:XX:XX:XX brd ff:ff:ff:ff:ff:ff
14: tap132i0: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master fwbr132i0 state UNKNOWN mode DEFAULT group default qlen 1000
link/ether XX:XX:XX:XX:XX:XX brd ff:ff:ff:ff:ff:ff
15: fwbr132i0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
link/ether XX:XX:XX:XX:XX:XX brd ff:ff:ff:ff:ff:ff
16: fwpr132p0@fwln132i0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master vmbr0 state UP mode DEFAULT group default qlen 1000
link/ether XX:XX:XX:XX:XX:XX brd ff:ff:ff:ff:ff:ff
17: fwln132i0@fwpr132p0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master fwbr132i0 state UP mode DEFAULT group default qlen 1000
link/ether XX:XX:XX:XX:XX:XX brd ff:ff:ff:ff:ff:ff
26: tap110i0: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master fwbr110i0 state UNKNOWN mode DEFAULT group default qlen 1000
link/ether XX:XX:XX:XX:XX:XX brd ff:ff:ff:ff:ff:ff
27: fwbr110i0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
link/ether XX:XX:XX:XX:XX:XX brd ff:ff:ff:ff:ff:ff
28: fwpr110p0@fwln110i0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master vmbr0 state UP mode DEFAULT group default qlen 1000
link/ether XX:XX:XX:XX:XX:XX brd ff:ff:ff:ff:ff:ff
29: fwln110i0@fwpr110p0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master fwbr110i0 state UP mode DEFAULT group default qlen 1000
link/ether XX:XX:XX:XX:XX:XX brd ff:ff:ff:ff:ff:ff
62: tap107i0: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master fwbr107i0 state UNKNOWN mode DEFAULT group default qlen 1000
link/ether XX:XX:XX:XX:XX:XX brd ff:ff:ff:ff:ff:ff
63: fwbr107i0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
link/ether XX:XX:XX:XX:XX:XX brd ff:ff:ff:ff:ff:ff
64: fwpr107p0@fwln107i0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master vmbr0 state UP mode DEFAULT group default qlen 1000
link/ether XX:XX:XX:XX:XX:XX brd ff:ff:ff:ff:ff:ff
65: fwln107i0@fwpr107p0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master fwbr107i0 state UP mode DEFAULT group default qlen 1000
link/ether XX:XX:XX:XX:XX:XX brd ff:ff:ff:ff:ff:ff
66: tap109i0: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master fwbr109i0 state UNKNOWN mode DEFAULT group default qlen 1000
link/ether XX:XX:XX:XX:XX:XX brd ff:ff:ff:ff:ff:ff
67: fwbr109i0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
link/ether XX:XX:XX:XX:XX:XX brd ff:ff:ff:ff:ff:ff
68: fwpr109p0@fwln109i0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master vmbr0 state UP mode DEFAULT group default qlen 1000
link/ether XX:XX:XX:XX:XX:XX brd ff:ff:ff:ff:ff:ff
69: fwln109i0@fwpr109p0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master fwbr109i0 state UP mode DEFAULT group default qlen 1000
link/ether XX:XX:XX:XX:XX:XX brd ff:ff:ff:ff:ff:ff

Code:
root@pve-host2:~# ip r get 192.168.1.210
192.168.1.210 dev vmbr0 src 192.168.1.202 uid 0
    cache expires 504sec mtu 1450
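
Note the cached route MTU of 1450 in the output above. One way to double-check the effective path MTU from the PVE side would be something like this (assuming ICMP isn't blocked along the path):

Code:
root@pve-host2:~# ping -M do -s 1422 -c 1 192.168.1.210   # 1422 + 28 bytes of headers = 1450, should get a reply
root@pve-host2:~# ping -M do -s 1472 -c 1 192.168.1.210   # 1472 + 28 = 1500, should fail if the path MTU really is 1450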

So, with everything looking good at the PVE end, I checked the InfluxDB end. I run InfluxDB as a Docker container on Docker Swarm. The InfluxDB IP used by PVE is 192.168.1.210, which is a keepalived VIP pointing to docker-1, docker-2 or docker-3, and the InfluxDB service can run on any of those docker-X machines.

If I run the same commands on docker-1 (where the InfluxDB service happened to be running when I checked), I get the following, which indicates that all the docker-1 NICs are correctly set to MTU 1500:

Code:
root@docker-1:~# ip l
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: ens18: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP mode DEFAULT group default qlen 1000
link/ether XX:XX:XX:XX:XX:XX brd ff:ff:ff:ff:ff:ff
4: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN mode DEFAULT group default
link/ether XX:XX:XX:XX:XX:XX brd ff:ff:ff:ff:ff:ff
12: docker_gwbridge: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default
link/ether XX:XX:XX:XX:XX:XX brd ff:ff:ff:ff:ff:ff
14: vethf5e252b@if13: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker_gwbridge state UP mode DEFAULT group default
link/ether XX:XX:XX:XX:XX:XX brd ff:ff:ff:ff:ff:ff link-netnsid 3
18: vethffcc101@if17: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker_gwbridge state UP mode DEFAULT group default
link/ether XX:XX:XX:XX:XX:XX brd ff:ff:ff:ff:ff:ff link-netnsid 4
57: veth8e71dda@if56: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker_gwbridge state UP mode DEFAULT group default
link/ether XX:XX:XX:XX:XX:XX brd ff:ff:ff:ff:ff:ff link-netnsid 13
117: veth3fdf846@if116: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker_gwbridge state UP mode DEFAULT group default
link/ether XX:XX:XX:XX:XX:XX brd ff:ff:ff:ff:ff:ff link-netnsid 9
121: veth73e96b8@if120: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker_gwbridge state UP mode DEFAULT group default
link/ether XX:XX:XX:XX:XX:XX brd ff:ff:ff:ff:ff:ff link-netnsid 11
127: vetha7bbb89@if126: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker_gwbridge state UP mode DEFAULT group default
link/ether XX:XX:XX:XX:XX:XX brd ff:ff:ff:ff:ff:ff link-netnsid 8
133: veth872fe1f@if132: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker_gwbridge state UP mode DEFAULT group default
link/ether XX:XX:XX:XX:XX:XX brd ff:ff:ff:ff:ff:ff link-netnsid 7
137: vethcf11575@if136: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker_gwbridge state UP mode DEFAULT group default
link/ether XX:XX:XX:XX:XX:XX brd ff:ff:ff:ff:ff:ff link-netnsid 12
149: vethaf889e8@if148: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker_gwbridge state UP mode DEFAULT group default
link/ether XX:XX:XX:XX:XX:XX brd ff:ff:ff:ff:ff:ff link-netnsid 15
185: veth837206d@if184: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker_gwbridge state UP mode DEFAULT group default
link/ether XX:XX:XX:XX:XX:XX brd ff:ff:ff:ff:ff:ff link-netnsid 10
191: vethf08ce75@if190: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker_gwbridge state UP mode DEFAULT group default
link/ether XX:XX:XX:XX:XX:XX brd ff:ff:ff:ff:ff:ff link-netnsid 14

However... if I then run the same command inside the InfluxDB service/container, I get the following, which indicates that 2 of the Docker networks used by the InfluxDB service actually have an MTU of 1450 set?!?

Code:
root@d022f7fd2ead:/# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
118: eth0@if119: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP group default
link/ether XX:XX:XX:XX:XX:XX brd ff:ff:ff:ff:ff:ff link-netnsid 0
inet 10.0.0.54/24 brd 10.0.0.255 scope global eth0
valid_lft forever preferred_lft forever
120: eth2@if121: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
link/ether XX:XX:XX:XX:XX:XX brd ff:ff:ff:ff:ff:ff link-netnsid 2
inet 172.18.0.8/16 brd 172.18.255.255 scope global eth2
valid_lft forever preferred_lft forever
122: eth1@if123: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP group default
link/ether XX:XX:XX:XX:XX:XX brd ff:ff:ff:ff:ff:ff link-netnsid 1
inet 10.0.5.64/24 brd 10.0.5.255 scope global eth1
valid_lft forever preferred_lft forever

Both of the Docker networks above were created with default settings, so I suspect this comes from Docker rather than from PVE (presumably the overlay networks default to an MTU of 1450 to leave room for the 50 bytes of VXLAN encapsulation overhead). That being said, I still think it would be beneficial to be able to specify the MTU in the PVE status.cfg file to override the default hardcoded value of 1500?
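
Something like this is what I have in mind, assuming an mtu key gets added (purely hypothetical until such a change actually lands; the server/port values are just examples):

Code:
influxdb:
    server 192.168.1.210
    port 8089
    mtu 1450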

@Hyacin - Do you also run the target InfluxDB instance in docker and, if so, can you confirm the above matches your setup?
 
could you post the output of ip l (feel free to censor the MACs) and ip r get 192.168.0.51?
Code:
root@NUC10i3FNH-2:~# ip l
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eno1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 9000 qdisc pfifo_fast master bond0 state UP mode DEFAULT group default qlen 1000
    link/ether 1c:69:7a:62:2b:67 brd ff:ff:ff:ff:ff:ff
3: enx0024321743ee: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 9000 qdisc pfifo_fast master bond0 state UP mode DEFAULT group default qlen 1000
    link/ether 1c:69:7a:62:2b:67 brd ff:ff:ff:ff:ff:ff
4: wlp0s20f3: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether d8:3b:bf:98:97:70 brd ff:ff:ff:ff:ff:ff
5: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 9000 qdisc noqueue master vmbr0 state UP mode DEFAULT group default qlen 1000
    link/ether 1c:69:7a:62:2b:67 brd ff:ff:ff:ff:ff:ff
6: vmbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 1c:69:7a:62:2b:67 brd ff:ff:ff:ff:ff:ff
7: vmbr0.1@vmbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 1c:69:7a:62:2b:67 brd ff:ff:ff:ff:ff:ff
8: vmbr0.10@vmbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 1c:69:7a:62:2b:67 brd ff:ff:ff:ff:ff:ff
9: vmbr0.13@vmbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 1c:69:7a:62:2b:67 brd ff:ff:ff:ff:ff:ff
10: vmbr1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 9a:b9:2a:d0:1d:42 brd ff:ff:ff:ff:ff:ff
11: tap131i0: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> mtu 9000 qdisc pfifo_fast master fwbr131i0 state UNKNOWN mode DEFAULT group default qlen 1000
    link/ether 1a:e3:ea:ee:94:95 brd ff:ff:ff:ff:ff:ff
12: fwbr131i0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 46:81:01:6e:63:28 brd ff:ff:ff:ff:ff:ff
13: fwpr131p0@fwln131i0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc noqueue master vmbr0 state UP mode DEFAULT group default qlen 1000
    link/ether ca:f4:3d:42:f0:5d brd ff:ff:ff:ff:ff:ff
14: fwln131i0@fwpr131p0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc noqueue master fwbr131i0 state UP mode DEFAULT group default qlen 1000
    link/ether 46:81:01:6e:63:28 brd ff:ff:ff:ff:ff:ff
15: tap131i1: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> mtu 9000 qdisc pfifo_fast master vmbr1 state UNKNOWN mode DEFAULT group default qlen 1000
    link/ether 9a:b9:2a:d0:1d:42 brd ff:ff:ff:ff:ff:ff
16: tap192i0: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> mtu 9000 qdisc pfifo_fast master fwbr192i0 state UNKNOWN mode DEFAULT group default qlen 1000
    link/ether 9e:f6:e2:12:b8:8a brd ff:ff:ff:ff:ff:ff
17: fwbr192i0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 62:0f:6a:06:99:df brd ff:ff:ff:ff:ff:ff
18: fwpr192p0@fwln192i0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc noqueue master vmbr0 state UP mode DEFAULT group default qlen 1000
    link/ether 42:8e:6e:6c:d8:7f brd ff:ff:ff:ff:ff:ff
19: fwln192i0@fwpr192p0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc noqueue master fwbr192i0 state UP mode DEFAULT group default qlen 1000
    link/ether 62:0f:6a:06:99:df brd ff:ff:ff:ff:ff:ff
20: tap192i1: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> mtu 9000 qdisc pfifo_fast master vmbr1 state UNKNOWN mode DEFAULT group default qlen 1000
    link/ether 9a:43:bf:b2:94:d5 brd ff:ff:ff:ff:ff:ff
21: tap202i0: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> mtu 9000 qdisc pfifo_fast master fwbr202i0 state UNKNOWN mode DEFAULT group default qlen 1000
    link/ether f6:96:48:a0:ba:13 brd ff:ff:ff:ff:ff:ff
22: fwbr202i0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 96:1a:4e:5c:5a:e3 brd ff:ff:ff:ff:ff:ff
23: fwpr202p0@fwln202i0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc noqueue master vmbr0 state UP mode DEFAULT group default qlen 1000
    link/ether 5e:2c:90:c1:af:c7 brd ff:ff:ff:ff:ff:ff
24: fwln202i0@fwpr202p0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc noqueue master fwbr202i0 state UP mode DEFAULT group default qlen 1000
    link/ether 96:1a:4e:5c:5a:e3 brd ff:ff:ff:ff:ff:ff
25: tap129i0: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> mtu 9000 qdisc pfifo_fast master fwbr129i0 state UNKNOWN mode DEFAULT group default qlen 1000
    link/ether be:74:e9:2a:c9:f6 brd ff:ff:ff:ff:ff:ff
26: fwbr129i0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 2e:bf:cb:9a:ef:91 brd ff:ff:ff:ff:ff:ff
27: fwpr129p0@fwln129i0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc noqueue master vmbr0 state UP mode DEFAULT group default qlen 1000
    link/ether a6:51:95:c1:97:10 brd ff:ff:ff:ff:ff:ff
28: fwln129i0@fwpr129p0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc noqueue master fwbr129i0 state UP mode DEFAULT group default qlen 1000
    link/ether 2e:bf:cb:9a:ef:91 brd ff:ff:ff:ff:ff:ff
29: tap129i1: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> mtu 9000 qdisc pfifo_fast master fwbr129i1 state UNKNOWN mode DEFAULT group default qlen 1000
    link/ether 86:de:15:f7:5c:a5 brd ff:ff:ff:ff:ff:ff
30: fwbr129i1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 7a:04:b2:d0:30:30 brd ff:ff:ff:ff:ff:ff
31: fwpr129p1@fwln129i1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc noqueue master vmbr0 state UP mode DEFAULT group default qlen 1000
    link/ether de:78:dd:aa:dd:1d brd ff:ff:ff:ff:ff:ff
32: fwln129i1@fwpr129p1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc noqueue master fwbr129i1 state UP mode DEFAULT group default qlen 1000
    link/ether 7a:04:b2:d0:30:30 brd ff:ff:ff:ff:ff:ff
33: tap129i2: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> mtu 9000 qdisc pfifo_fast master vmbr1 state UNKNOWN mode DEFAULT group default qlen 1000
    link/ether f6:8d:08:0f:b0:e5 brd ff:ff:ff:ff:ff:ff
root@NUC10i3FNH-2:~#

Code:
root@NUC10i3FNH-2:~# ip r get 192.168.0.51
192.168.0.51 dev vmbr0.1 src 192.168.0.11 uid 0
    cache expires 33sec mtu 1450
root@NUC10i3FNH-2:~#

192.168.0.51 is a keepalived address on my Docker Swarm that could sometimes be active on this host and sometimes on another host, depending on who the keepalived master is and where that VM is running at present - in case that matters :)
 
@Hyacin - Do you also run the target InfluxDB instance in docker and, if so, can you confirm the above matches your setup?

My setup matches almost exactly, that's funny (and maybe the root of our problem!), and yes, same deal with the MTUs in the container, interesting -

Code:
bash-5.0# ip a | grep -i mtu
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1000
8863: eth1@if8864: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1450 qdisc noqueue state UP
8865: eth2@if8866: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue state UP
8867: eth0@if8868: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1450 qdisc noqueue state UP
bash-5.0#

On the host the container lives on though -

Code:
dockernode-4:~# ip a | grep -i mtu
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1000
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc pfifo_fast state UP qlen 1000
4: docker_gwbridge: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP
5: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN
11: veth828a114@if10: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue master docker_gwbridge state UP
9339: vethfb13587@if9338: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue master docker_gwbridge state UP
8866: vethfc78276@if8865: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue master docker_gwbridge state UP
dockernode-4:~#
 
Looks like @tom has applied a patch for this to enable the MTU to be configured in /etc/pve/status.cfg in a future release of PVE. So we will have to wait for that release and then set the MTU to 1450 instead of the default of 1500.
Thanks @tom and @fabian for sorting this!

https://bugzilla.proxmox.com/show_bug.cgi?id=2802#c14
Won't that make the box drop any packets that come in between 1450 and 1500 bytes? Which could be important cluster traffic, or important VM stuff (NFS, iSCSI), etc.?

I think unless one is willing to make their entire network 1450, it's probably best to wait if that is the only fix, no?

Edit: I just hand-patched those three modules, not hard at all. I stop getting the error on each node as I patch it, but I also don't have data showing up in Influx yet. I'm patching all my boxes first before I dig much deeper.
 
As I recall, you’ll need to run systemctl restart pvestatd after making the change on each box. I’ll wait for the patch before making the changes.
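
For reference, that restart would be something like this on each node (the prompt is just an example):

Code:
root@pve-host1:~# systemctl restart pvestatd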
 
Won't that make the box drop any packets that come in between 1450 and 1500 bytes? Which could be important cluster traffic, or important VM stuff (NFS, iSCSI), etc.?

I think unless one is willing to make their entire network 1450, it's probably best to wait if that is the only fix, no?

Edit: I just hand-patched those three modules, not hard at all. I stop getting the error on each node as I patch it, but I also don't have data showing up in Influx yet. I'm patching all my boxes first before I dig much deeper.

if by 'that' you mean the patch, then no. it just starts splitting up the metrics sooner so they can get sent instead of being dropped by the kernel for being too big. it has no effect on other traffic besides the metrics sending. whether your setup in general works or not is not for me to judge ;)
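
for anyone curious, the idea is roughly like this (just a sketch of the concept in plain Perl, not the actual PVE code; the host, port and metric lines below are made up):

Code:
#!/usr/bin/perl
# Sketch: buffer InfluxDB line-protocol strings and flush a UDP datagram
# whenever adding one more line would exceed the per-datagram budget
# derived from the assumed (and, with the patch, configurable) MTU.
use strict;
use warnings;
use IO::Socket::IP;

my $mtu    = 1450;        # assumed path MTU
my $budget = $mtu - 48;   # minus IPv6 (40 byte) + UDP (8 byte) headers

my $sock = IO::Socket::IP->new(
    PeerHost => '192.168.0.51',   # example InfluxDB UDP endpoint
    PeerPort => 8089,
    Proto    => 'udp',
) or die "socket: $!";

my @lines = map { "cpu,host=pve-host$_ usage=0.5" } 1 .. 500;   # dummy metrics

my $buf = '';
for my $line (@lines) {
    if (length($buf) && length($buf) + length($line) + 1 > $budget) {
        defined $sock->send($buf) or warn "send: $!";   # flush current datagram
        $buf = '';
    }
    $buf .= $line . "\n";
}
defined $sock->send($buf) or warn "send: $!" if length $buf;    # flush the rest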
 
if by 'that' you mean the patch, then no. it just starts splitting up the metrics sooner so they can get sent instead of being dropped by the kernel for being too big. it has no effect on other traffic besides the metrics sending. whether your setup in general works or not is not for me to judge ;)
No, I did not mean the patch ... patch is awesome, thanks - everything is working great for me now.

I meant changing the link MTU to 1450 on the Proxmox box just to fix this reporting issue ... that is the "that" that I was saying could likely have additional (and very difficult and confusing to troubleshoot) side effects if you didn't change that entire network segment to 1450 as well :-)
 
