Hi all,
Since version 6.0, and still in the current version 6.2, we see the following behavior when running backups over WAN to NFS.
We have an 8-host cluster (all HP DL380, G7 up to W9) that is otherwise running fine.
When doing backups over a WAN connection to a QNAP, we first see a lot of this:
May 21 22:39:03 pve56 kernel: [210590.778116] rpc_check_timeout: 247 callbacks suppressed
May 21 22:39:03 pve56 kernel: [210590.778117] nfs: server XXX.XXX.XXX.XXX not responding, still trying
May 21 22:39:03 pve56 kernel: [210590.798205] nfs: server XXX.XXX.XXX.XXX not responding, still trying
This is somewhat expected because we can only use a 100 Mbit connection.
But then the watchdog times out and the host is restarted.
May 21 22:39:05 pve56 watchdog-mux[1323]: client watchdog expired - disable watchdog updates
This is reproducible and only happens while running backups over NFS. It happens on all nodes when they get timeouts on NFS.
These are the package versions we are using:
proxmox-ve: 6.2-1 (running kernel: 5.4.41-1-pve)
pve-manager: 6.2-4 (running version: 6.2-4/9824574a)
pve-kernel-5.4: 6.2-2
pve-kernel-helper: 6.2-2
pve-kernel-5.3: 6.1-6
pve-kernel-5.0: 6.0-11
pve-kernel-5.4.41-1-pve: 5.4.41-1
pve-kernel-5.4.34-1-pve: 5.4.34-2
pve-kernel-5.3.18-3-pve: 5.3.18-3
pve-kernel-5.3.18-2-pve: 5.3.18-2
pve-kernel-5.0.21-5-pve: 5.0.21-10
pve-kernel-5.0.15-1-pve: 5.0.15-1
ceph: 14.2.9-pve1
ceph-fuse: 14.2.9-pve1
corosync: 3.0.3-pve1
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown: 0.8.35+pve1
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.15-pve1
libproxmox-acme-perl: 1.0.4
libpve-access-control: 6.1-1
libpve-apiclient-perl: 3.0-3
libpve-common-perl: 6.1-2
libpve-guest-common-perl: 3.0-10
libpve-http-server-perl: 3.0-5
libpve-storage-perl: 6.1-8
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve4
lxc-pve: 4.0.2-1
lxcfs: 4.0.3-pve2
novnc-pve: 1.1.0-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.2-1
pve-cluster: 6.1-8
pve-container: 3.1-6
pve-docs: 6.2-4
pve-edk2-firmware: 2.20200229-1
pve-firewall: 4.1-2
pve-firmware: 3.1-1
pve-ha-manager: 3.0-9
pve-i18n: 2.1-2
pve-qemu-kvm: 5.0.0-2
pve-xtermjs: 4.3.0-1
qemu-server: 6.2-2
smartmontools: 7.1-pve2
spiceterm: 3.1-1
vncterm: 1.6-1
zfsutils-linux: 0.8.4-pve1
This is the storage configuration in storage.cfg:
nfs: backup
	export /volume1/backup2
	path /mnt/pve/backupKI
	server XXX.XXX.XXX.XXX
	content backup,vztmpl
	maxfiles 7
	options vers=3
Any hints on how to set the NFS timeo in the configuration?
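For reference, this is roughly what we have in mind (untested). It assumes that the options line in storage.cfg is passed through to the NFS mount, so that the standard timeo (in tenths of a second) and retrans mount options from nfs(5) can simply be appended. The values shown are placeholders, not something we have verified:

```
nfs: backup
	export /volume1/backup2
	path /mnt/pve/backupKI
	server XXX.XXX.XXX.XXX
	content backup,vztmpl
	maxfiles 7
	# assumption: extra NFS mount options are accepted here, comma-separated
	# timeo=600 means 60 seconds per attempt, retrans=5 means 5 retries (placeholder values)
	options vers=3,timeo=600,retrans=5
```

We do not know whether a longer timeout alone would keep the watchdog from expiring, so this is only a starting point for discussion.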
Regards Lukas